AD-A161  749 
UNCLASSIFIED 


STATIONARY TIME SERIES QUANTILE FUNCTIONS NONPARAMETRIC
INFERENCE AND RAN.. (U) TEXAS A AND M UNIV COLLEGE
STATION DEPT OF STATISTICS A HARPAZ SEP 85 TR-A-21
ARO-20140 N00014-84-K-0350 F/G 12/1 NL



TEXAS  A&M  UNIVERSITY 

COLLEGE  STATION,  TEXAS  77843-3143 


STATIONARY  TIME  SERIES,  QUANTILE  FUNCTIONS,  NONPARAMETRIC 
INFERENCE  AND  RANK  TRANSFORM  SPECTRUM 


by Avraham Harpaz
Department  of  Statistics 
Texas  A&M  University 


Technical  Report  No.  A-31 
September  1985 


Texas  A&M  Research  Foundation 
Project  Nos.  5104  and  4858 


"Multiple Time Series Modeling and
Time Series Theoretic Statistical Methods"

and 

"Functional  Statistical  Data  Analysis  and  Modeling" 


Sponsored  by  the  Army  Research  Office  and 
Office  of  Naval  Research 


Contracts N00014-84-K-0350 and DAAG29-83-K-0051
Professor  Emanuel  Parzen,  Principal  Investigator 
Approved  for  public  release;  distribution  unlimited 


Unclassified
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE
READ INSTRUCTIONS BEFORE COMPLETING FORM

3. RECIPIENT'S CATALOG NUMBER

4. TITLE (and Subtitle)
Stationary Time Series, Quantile Functions, Nonparametric Inference
and Rank Transform Spectrum

5. TYPE OF REPORT & PERIOD COVERED
Technical

6. PERFORMING ORG. REPORT NUMBER

7. AUTHOR(s)
Avraham Harpaz

8. CONTRACT OR GRANT NUMBER(s)
N00014-84-K-0350
DAAG29-83-K-0051

9. PERFORMING ORGANIZATION NAME AND ADDRESS
Texas A&M University
Institute of Statistics
College Station, TX 77843

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS

11. CONTROLLING OFFICE NAME AND ADDRESS

12. REPORT DATE
September 1985

13. NUMBER OF PAGES

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)

15. SECURITY CLASS. (of this report)
Unclassified

15a. DECLASSIFICATION/DOWNGRADING SCHEDULE

16. DISTRIBUTION STATEMENT (of this Report)
Approved for public release; distribution unlimited.

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)
NA

18. SUPPLEMENTARY NOTES

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)
Stationary Time Series, Quantile Functions, Nonparametric Tests,
Rank Transform Spectrum, Wilcoxon Tests, Linear Rank Statistics.

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)
In this dissertation, weak convergence results for dependent sequences
are used to derive the asymptotic distribution of linear rank
statistics for the two sample problem. It is shown that the asymptotic
variance of linear rank statistics when computed from two independent
time series depends on the spectrum of the rank transform time series.
The behavior of the rank transform spectrum in terms of its relations
to the original spectrum is also empirically examined.




ABSTRACT 


Stationary  Time  Series,  Quantile  Functions,  Nonparametric 
Inference  and  Rank  Transform  Spectrum.  (December  1985) 
Avraham Harpaz, B.S., AGEC Hebrew U. of Jerusalem;
M.S., AGEC Hebrew U. of Jerusalem

Chairman of Advisory Committee: Dr. Emanuel Parzen


In  this  dissertation,  weak  convergence  results  for  dependent 
sequences  are  used  to  derive  the  asymptotic  distribution  of  linear 
rank  statistics  for  the  two  sample  problem.  It  is  shown  that  the 
asymptotic  variance  of  linear  rank  statistics  when  computed  from  two 
independent  time  series  depends  on  the  spectrum  of  the  rank  transform 
time  series.  The  behavior  of  the  rank  transform  spectrum  in  terms  of 
its  relations  to  the  original  spectrum  is  also  empirically  examined. 

CHAPTER I


INTRODUCTION 

1.1  The  Problem 

Statistical  theory  can  be  divided  into  two  categories: 
independence  and  dependence.  In  the  independence  category,  the  basic 
setting  is  a  sequence  of  independent  random  variables  (r.v.s) 
(possibly  multivariate) .  This  category  contains  the  areas  of 
parametric  and  nonparametric  inference  and  estimation,  regression, 
analysis  of  variance,  multivariate  analysis  and  more.  In  the  second 
category,  the  basic  setting  is  a  sequence  of  dependent  r.v.s.  This 
category  contains  the  areas  of  time  series  analysis  and  stochastic 
processes  (Markov  chains,  renewal  theory,  etc.). 

There is a growing literature on the extension of asymptotic
theorems from the independence case to the dependence case by
introducing concepts of asymptotic independence. Unfortunately, most
of the published results have not been expressed in forms that are
easy to apply in statistical procedures. In this work our goal is to
interpret and extend the theory of nonparametric tests for time
series to make it usable in practical data analysis procedures. In
particular, the asymptotic theory of linear rank statistics in the
two sample problem is extended to express the asymptotic variance in
an interpretable and estimable form. Another part of this work is an
empirical examination of the properties of the spectrum of rank
transform time series as compared to the spectrum of the original
time series.

This dissertation will follow the format for the Journal of the
American Statistical Association.

1.2  Literature  Review 

The  empirical  process  and  the  quantile  process  serve  as  the  main 
ingredients  in  the  theory  and  procedures  considered  in  this  work. 
The  theory  of  the  weak  convergence  of  these  processes  to  Gaussian 
processes,  in  the  independent  case,  is  covered  in  great  detail  in 
Billingsley  (1968)  and  in  Csorgo  and  Revesz  (1981).  For  the  two 
sample  problem,  Pyke  and  Shorack  (1968)  introduced  a  two  sample 
empirical  process  which  can  be  used  to  represent  linear  rank 
statistics,  and  developed  the  asymptotic  distribution  of  their 
process  for  the  independence  case.  The  asymptotic  theory  in  the 
dependence  case  of  stochastic  processes  which  can  be  used  to 
represent  linear  rank  statistics  is  an  active  area  of  research. 
Billingsley (1968) gave the basic weak convergence result for the
empirical process under $\phi$-mixing conditions. Sen (1971) has improved
Billingsley's result by weakening the required mixing conditions.
Yoshihara (1978) and Yokoyama (1980) have successively improved this
result by weakening twice more the required mixing conditions. The
weak  convergence  of  the  empirical  process  has  been  established  also 
for  two  other  notions  of  mixing  (see,  for  example,  Yokoyama  (1973) 
and  Mehra  and  Rao  (1975b)  for  results  under  strong  mixing  conditions 
and  Gastwirth  and  Rubin  (1975)  for  introduction  of  strong  mixing  As 
conditions  and  weak  convergence  results  under  those  conditions).  The 


weak  convergence  of  the  quantile  process  is  usually  established  via  a 
Bahadur  representation  type  result  (which  is  stronger  than  just 
implying  the  weak  convergence  of  the  quantile  process).  Babu  and 
Singh (1978) give such a result for both $\phi$-mixing and strong mixing
sequences.

An application of the weak convergence of the empirical process
to the two sample problem is given in Fears and Mehra (1974) for
$\phi$-mixing sequences. They extend Pyke and Shorack (1968) to include
the dependence case. Mehra and Rao (1975a) applied the weak
convergence of the empirical process to obtain asymptotic results for
functions of order statistics for both $\phi$-mixing and strong mixing
sequences. Another application of that type is given in Falk and
Kohne (1984). They study the behavior of the sign test under mixing
conditions  and  suggest  a  way  to  adjust  the  critical  region  to  the 
dependence  case. 

None of the above papers takes the approach considered in this
work,  which  is  to  express  the  asymptotic  variance  of  the  statistics 
studied  in  terms  of  the  spectral  density  of  some  related  time  series, 
which  we  call  the  rank  transform  time  series. 


CHAPTER  II 


ASYMPTOTIC  THEORY  FOR  EMPIRICAL  AND  QUANTILE  PROCESSES 

2.1  Distribution  Functions  and  Quantile  Functions  -  Definitions  and 

Basic  Properties 

Let $X$ be a random variable (RV) defined on a probability space
$(\Omega, \mathcal{A}, P)$. Then its distribution function (DF), $F$, is defined by

$$F(x) = \Pr(X \le x) = P\{\omega \in \Omega : X(\omega) \le x\}, \quad -\infty < x < \infty . \tag{2.1.1}$$

The quantile function (QF), $Q$, corresponding to $F$ is defined by

$$Q(u) = F^{-1}(u) = \inf\{x \in \mathbb{R} : F(x) \ge u\}, \quad 0 \le u \le 1 . \tag{2.1.2}$$

Note that $Q(0)$ is formally $-\infty$, and when $F(x) < 1$ for all
$x \in \mathbb{R}$ then $Q(1)$ is formally $+\infty$.

When $F$ is continuous and strictly increasing then $Q$ is its true
inverse. In general, however, we have

$$F(x) = \sup\{u \in [0,1] : Q(u) \le x\}, \quad -\infty < x < \infty . \tag{2.1.3}$$

From these definitions it follows that $F$ is right-continuous, $Q$ is
left-continuous and both $F$ and $Q$ are nondecreasing. Some of the
basic relationships between $F$ and $Q$ are summarized in the following
theorem.

Theorem 2.1.1: Let $F$ be a DF with corresponding quantile
function $Q$. Then

a) $$F(Q(u)) \ge u, \quad 0 \le u \le 1, \tag{2.1.4}$$

with equality if $F$ is continuous at $Q(u)$.

b) $$F(x) \ge u \ \text{ iff } \ Q(u) \le x . \tag{2.1.5}$$

Proof: The proof of these standard facts is given here in order
to illustrate the methods by which one can relate the properties of
distribution functions and quantile functions.

a) First observe that for $u = 0$, 2.1.4 is satisfied trivially since $F$
is nonnegative. Also, when $Q(u) = +\infty$, then $F(Q(u)) = F(+\infty) = 1 \ge u$
since $0 \le u \le 1$. Next, for $0 < u \le 1$ and when $Q(u) < +\infty$, it
follows from the definition of $Q$ that

$$F(Q(u) - \epsilon) < u \le F(Q(u) + \epsilon), \quad \text{any } \epsilon > 0 .$$

Hence, letting $\epsilon \to 0$ and using the right-continuity of $F$, we have

$$F(Q(u)) \ge u ,$$

and if $F$ is continuous at $Q(u)$ we also have

$$F(Q(u)) \le u ,$$

which gives the equality.

b) Assume that $Q(u) \le x$. Then the monotonicity of $F$ and part a)
above give

$$F(x) \ge F(Q(u)) \ge u .$$

Now assume that $F(x) \ge u$. Then

$$x \in \{z : F(z) \ge u\}$$

and hence, using the definition of $Q$,

$$Q(u) \le x .$$

Q.E.D.

Let $I_A$ be the indicator function of a set $A$ defined by

$$I_A(t) = \begin{cases} 1 & \text{if } t \in A \\ 0 & \text{if } t \notin A . \end{cases} \tag{2.1.6}$$
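
The two relations of Theorem 2.1.1 can be checked numerically on a distribution with jumps, where $F$ and $Q$ are not true inverses. The following is a minimal Python sketch under stated assumptions: the three-point distribution, the evaluation grids and the names `support`, `probs`, `F` and `Q` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical discrete distribution (an assumption for illustration only):
# probabilities are dyadic so that all cumulative sums are exact in binary.
support = [1.0, 2.0, 4.0]
probs = [0.25, 0.5, 0.25]
cum = list(accumulate(probs))  # F at the support points: [0.25, 0.75, 1.0]


def F(x):
    """Right-continuous distribution function F(x) = Pr(X <= x), as in (2.1.1)."""
    k = bisect_right(support, x)  # number of support points <= x
    return cum[k - 1] if k > 0 else 0.0


def Q(u):
    """Quantile function Q(u) = inf{x : F(x) >= u}, as in (2.1.2)."""
    if u <= 0.0:
        return float("-inf")  # Q(0) is formally minus infinity
    for x, c in zip(support, cum):
        if c >= u:
            return x
    return float("inf")  # only reached if u > 1


us = [0.1, 0.25, 0.5, 0.6, 0.75, 0.9, 1.0]
xs = [0.5, 1.0, 2.0, 3.0, 4.0, 5.0]

# Part (a): F(Q(u)) >= u, with strict inequality possible at jumps of F.
part_a = all(F(Q(u)) >= u for u in us)
# Part (b): F(x) >= u if and only if Q(u) <= x.
part_b = all((F(x) >= u) == (Q(u) <= x) for u in us for x in xs)

print(part_a, part_b)
print(Q(0.6), F(Q(0.6)))  # F(Q(0.6)) exceeds 0.6 because F jumps at Q(0.6)
```

The value $u = 0.6$ falls inside a jump of $F$, so part (a) holds there with strict inequality, which is the case the continuity condition in the theorem rules out.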


Let $X(1), X(2), \ldots, X(n)$ be RVs defined on a common probability space
$(\Omega, \mathcal{A}, P)$ and having common DF $F$ and QF $Q$. The sample DF $F_n$ (also
called the empirical DF and denoted EDF) is defined by

$$F_n(x) = (1/n) \sum_{i=1}^{n} I_{(-\infty, x]}(X(i)), \quad -\infty < x < \infty . \tag{2.1.7}$$

The corresponding sample QF $Q_n$ is defined by

$$Q_n(u) = \inf\{x : F_n(x) \ge u\}, \quad 0 < u \le 1 . \tag{2.1.8}$$

In terms of $X(1:n), X(2:n), \ldots, X(n:n)$, the order statistics
corresponding to $X(1), X(2), \ldots, X(n)$, a formula for $Q_n$ is:

$$Q_n(u) = X(j:n) \quad \text{for } (j-1)/n < u \le j/n, \quad j = 1, \ldots, n . \tag{2.1.9}$$

Note that $Q_n(0)$ was left undefined by the above definition. Parzen
(1979) suggests that $Q_n(0)$ be taken to be a natural minimum when one
is known (e.g. for a nonnegative RV one can take $Q_n(0) = 0$).
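
Definitions (2.1.7) and (2.1.9) translate directly into code. The sketch below is a minimal Python illustration, not part of the original development; the helper names `edf` and `sample_qf`, the data values, and the small rounding tolerance are assumptions.

```python
import math
from bisect import bisect_right


def edf(sample):
    """Return F_n, the empirical DF of (2.1.7), as a function of x."""
    xs = sorted(sample)
    n = len(xs)

    def Fn(x):
        return bisect_right(xs, x) / n  # fraction of observations <= x

    return Fn


def sample_qf(sample):
    """Return Q_n via (2.1.9): Q_n(u) = X(j:n) for (j-1)/n < u <= j/n."""
    xs = sorted(sample)
    n = len(xs)

    def Qn(u):
        if not 0.0 < u <= 1.0:
            raise ValueError("Q_n(u) is defined here only for 0 < u <= 1")
        # j = ceil(n*u); the tolerance (an assumption) absorbs rounding
        # error when u is meant to equal j/n exactly.
        j = min(n, max(1, math.ceil(n * u - 1e-12)))
        return xs[j - 1]

    return Qn


data = [3.1, 0.4, 2.2, 5.0, 1.7]  # hypothetical observations
Fn = edf(data)
Qn = sample_qf(data)
print(Fn(2.2), Qn(0.6), Qn(0.61), Qn(1.0))
```

`Qn(0)` raises an error on purpose, since $Q_n(0)$ is left undefined by (2.1.8); a caller who knows a natural minimum can handle that case separately.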

2.2 Asymptotic Distribution of Fn and Qn : Independent Case

In this section we assume that the RVs $X(1), \ldots, X(n)$ are
independent and present some of the basic properties of $F_n$ and $Q_n$ as
estimators of $F$ and $Q$ respectively. The case of dependent RVs (i.e.
time series) will be treated below.

For fixed $x \in \mathbb{R}$, $F_n(x)$ is the average of independent
and identically distributed (IID) RVs each distributed as $B(1, F(x))$
(Binomial distribution with parameters 1 and $F(x)$). Therefore,

$$n F_n(x) \sim B(n, F(x)) ,$$

$$E[F_n(x)] = F(x) ,$$

$$\mathrm{Var}[F_n(x)] = (1/n) F(x) (1 - F(x)) .$$

Using the properties of the binomial distribution, the strong law of
large numbers (SLLN) and the IID version of the central limit theorem
(CLT), we immediately have the following result.

Theorem 2.2.1: Consistency and Asymptotic Normality of the
Sample Distribution Function. For any fixed $x \in \mathbb{R}$,

a) $$F_n(x) \to F(x) \tag{2.2.1}$$

where the convergence holds in probability (denoted $\xrightarrow{p}$), in
quadratic mean (denoted $\xrightarrow{q.m.}$) and almost surely (denoted $\xrightarrow{a.s.}$).

b) $$n^{1/2} [F_n(x) - F(x)] \xrightarrow{d} N(0, F(x)(1 - F(x))) . \tag{2.2.2}$$

For $Q_n$ we have the following results.

Theorem 2.2.2: Consistency of the Sample Quantile Function. Let
$0 < u < 1$ and assume that $Q$ is continuous at $u$. Then

$$Q_n(u) \xrightarrow{a.s.} Q(u) . \tag{2.2.3}$$

Proof: Let $\epsilon > 0$. Then by the definition of $Q$ and the assumed
continuity of $Q$ at $u$, we have

$$F(Q(u) - \epsilon) < u < F(Q(u) + \epsilon) .$$

From Theorem 2.2.1 (a) it follows that

$$F_n(Q(u) - \epsilon) \xrightarrow{a.s.} F(Q(u) - \epsilon)$$

and

$$F_n(Q(u) + \epsilon) \xrightarrow{a.s.} F(Q(u) + \epsilon) .$$

Hence,

$$\Pr[F_m(Q(u) - \epsilon) < u < F_m(Q(u) + \epsilon), \ \text{all } m \ge n] \to 1 \ \text{ as } n \to \infty .$$

Thus, by Theorem 2.1.1 (b),

$$\Pr[Q(u) - \epsilon < Q_m(u) \le Q(u) + \epsilon, \ \text{all } m \ge n] \to 1 \ \text{ as } n \to \infty ,$$

which is equivalent to

$$\Pr[\sup_{m \ge n} |Q_m(u) - Q(u)| \le \epsilon] \to 1 \ \text{ as } n \to \infty ,$$

giving the result.

Q.E.D.

The asymptotic normality of $Q_n(u)$ (at fixed $0 < u < 1$) can be
established under the assumption that $F$ possesses a density $f$ in a
neighborhood of $Q(u)$ and that $f$ is positive and continuous at $Q(u)$
(see Serfling (1980), Theorem A p. 77). An alternative approach
(adopted in this work) is to assume that the density $f$ is
differentiable at $Q(u)$ and use the Bahadur representation. This
important result, first given by Bahadur (1966), is stated in the
following theorem (for a proof, see Serfling (1980), pp. 91-95).

Theorem 2.2.3: Bahadur Representation. Let $0 < u < 1$. Assume that $F$
is twice differentiable at $Q(u)$, with

$$fQ(u) = f(Q(u)) = F'(Q(u)) > 0 .$$

Then

$$Q_n(u) = Q(u) - [F_n(Q(u)) - u] / fQ(u) + R_n , \tag{2.2.4}$$

where with probability one

$$R_n = O(n^{-3/4} (\log n)^{3/4}) , \quad \text{as } n \to \infty . \tag{2.2.5}$$

The  asymptotic  normality  of  Qn(u)  can  be  proved  as  an  immediate 
consequence  of  the  Bahadur  representation  and  the  asymptotic 
normality  of  Fn. 

Theorem 2.2.4: Asymptotic Normality of the Sample Quantile
Function. Assume that $f$ is differentiable at $Q(u)$ with $fQ(u) > 0$. Then

$$n^{1/2} [Q_n(u) - Q(u)] \xrightarrow{d} N(0, \sigma^2) \tag{2.2.6}$$

where

$$\sigma^2 = u(1 - u) / [fQ(u)]^2 . \tag{2.2.7}$$

Proof: By Theorem 2.2.3, we have

$$n^{1/2} [Q_n(u) - Q(u)] = -n^{1/2} [F_n(Q(u)) - u] / fQ(u) + n^{1/2} R_n$$

and

$$n^{1/2} R_n \xrightarrow{p} 0 .$$

The result now follows using Theorem 2.2.1 (b), Slutsky's Theorem and
the fact that

if $Z \sim N(0, \sigma^2)$ then $-Z/a \sim N(0, \sigma^2 / a^2)$.

Q.E.D.
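
The variance formula (2.2.7) can be compared with a small simulation. The sketch below is illustrative only: it assumes IID standard normal data and $u = 0.5$, so that $Q(u) = 0$ and $fQ(u) = 1/\sqrt{2\pi}$, and the function names, seed, sample size and number of replications are all hypothetical choices.

```python
import math
import random
import statistics


def sample_quantile(xs_sorted, u):
    """Q_n(u) = X(j:n) for (j-1)/n < u <= j/n, as in (2.1.9)."""
    n = len(xs_sorted)
    j = min(n, max(1, math.ceil(n * u - 1e-12)))  # tolerance for rounding
    return xs_sorted[j - 1]


def scaled_errors(n, reps, u, q_true, rng):
    """Monte Carlo draws of sqrt(n) * (Q_n(u) - Q(u)) for IID N(0, 1) samples."""
    draws = []
    for _ in range(reps):
        xs = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        draws.append(math.sqrt(n) * (sample_quantile(xs, u) - q_true))
    return draws


rng = random.Random(12345)  # fixed seed so that the run is repeatable
u = 0.5
fQ_u = 1.0 / math.sqrt(2.0 * math.pi)  # N(0, 1) density at Q(0.5) = 0
theory = u * (1.0 - u) / fQ_u ** 2      # (2.2.7); equals pi / 2 here
mc_var = statistics.pvariance(scaled_errors(200, 2000, u, 0.0, rng))
print(theory, mc_var)
```

The simulated variance should be close to $\pi/2$ but not equal to it, since (2.2.7) is a limiting value and the simulation uses a finite sample size and a finite number of replications.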

2.3  Empirical  and  Quantile  Processes  -  Independent  Case 

The behavior of $F_n$ and $Q_n$ when regarded as random functions or
stochastic processes is the theoretical basis of modern approaches to
deriving the distribution theory for many statistical techniques.
Define the empirical process $\{\alpha_n(u), 0 \le u \le 1\}$ by

$$\alpha_n(u) = n^{1/2} [F_n(Q(u)) - F(Q(u))] , \quad 0 \le u \le 1 , \tag{2.3.1}$$

and the quantile process $\{\beta_n(u), 0 \le u \le 1\}$ by

$$\beta_n(u) = n^{1/2} [Q_n(u) - Q(u)] , \quad 0 \le u \le 1 . \tag{2.3.2}$$

First we present an asymptotic normality type result for $\alpha_n$. The
ideas and techniques required for this type of result are those of
convergence of probability measures on metric spaces. A detailed and
comprehensive treatment of this topic can be found in Billingsley
(1968). In this work we use these concepts and results somewhat
heuristically.

Lemma 2.3.1: Let $U(1), U(2), \ldots, U(n)$ be IID RVs with common DF
$G$ and such that

$$0 \le U(i, \omega) \le 1 , \quad \omega \in \Omega , \ i = 1, \ldots, n .$$

Define the random element of $D(0,1)$, $\alpha_n^U$, by

$$\alpha_n^U(u) = n^{1/2} [G_n(u) - G(u)] , \quad 0 \le u \le 1 ,$$

where $G_n$ is the EDF corresponding to $U(1), \ldots, U(n)$. Then

$$\alpha_n^U \xrightarrow{d} \alpha^U ,$$

where $\alpha^U$ is the Gaussian random element of $D(0,1)$ specified by

$$E\{\alpha^U(u)\} = 0 , \quad 0 \le u \le 1 ,$$

$$E\{\alpha^U(u) \alpha^U(v)\} = G(u)(1 - G(v)) , \quad 0 \le u \le v \le 1 .$$

Proof: See Billingsley (1968) Theorem 16.4 p. 141.

Remark: When $U(1), \ldots, U(n)$ are uniformly distributed on $[0,1]$
($U(0,1)$) then $G(u) = u$, $0 \le u \le 1$, and $\alpha^U$ is distributed as the
Brownian bridge process.

Theorem 2.3.1: Asymptotic Distribution of the Empirical Process.
Let $X(1), \ldots, X(n)$ be IID RVs with common DF $F$ and let $\alpha_n$ be the
corresponding empirical process as defined by 2.3.1. Then

$$\alpha_n \xrightarrow{d} \alpha ,$$

where $\alpha$ is the Gaussian random element of $D(0,1)$ specified by

$$E\{\alpha(u)\} = 0 , \quad 0 \le u \le 1 ,$$

$$E\{\alpha(u) \alpha(v)\} = F(Q(u)) [1 - F(Q(v))] , \quad 0 \le u \le v \le 1 .$$

Proof: Define the RVs

$$U(i) = F(X(i)) , \quad i = 1, \ldots, n .$$

Then $U(1), \ldots, U(n)$ are IID RVs with common DF

$$G(u) = F(Q(u)) , \quad 0 \le u \le 1 .$$

The EDF corresponding to $U(1), \ldots, U(n)$ is given by

$$G_n(u) = F_n(Q(u)) , \quad 0 \le u \le 1 .$$

Lemma 2.3.1 now implies

$$\alpha_n^U \xrightarrow{d} \alpha^U ,$$

where $\alpha_n^U$ and $\alpha^U$ are defined as in Lemma 2.3.1 but in terms of $G$ and
$G_n$ defined here. But

$$\alpha_n^U = \alpha_n \quad \text{and} \quad \alpha \stackrel{d}{=} \alpha^U ,$$

giving the result.

Q.E.D.

Note that when $F$ is continuous then $\alpha$ is distributed as a Brownian
bridge process.
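
For IID uniform data the covariance of the empirical process at two points $u \le v$ equals the Brownian bridge covariance $u(1 - v)$ for every $n$, which makes the covariance structure in Theorem 2.3.1 easy to check by simulation. A minimal Python sketch, assuming $U(0,1)$ data; the function names, seed, sample size and the points $u = 0.3$, $v = 0.7$ are hypothetical choices.

```python
import math
import random


def empirical_process(sample, grid):
    """alpha_n(u) = sqrt(n) * (G_n(u) - u) for a sample assumed uniform on [0, 1]."""
    n = len(sample)
    return [math.sqrt(n) * (sum(1 for x in sample if x <= u) / n - u)
            for u in grid]


def monte_carlo_moments(n, reps, u, v, rng):
    """Sample variance of alpha_n(u) and sample covariance of alpha_n(u), alpha_n(v)."""
    a_u, a_v = [], []
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        au, av = empirical_process(sample, [u, v])
        a_u.append(au)
        a_v.append(av)
    mean_u = sum(a_u) / reps
    mean_v = sum(a_v) / reps
    var_u = sum((a - mean_u) ** 2 for a in a_u) / reps
    cov_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(a_u, a_v)) / reps
    return var_u, cov_uv


rng = random.Random(2024)  # fixed seed so that the run is repeatable
u, v = 0.3, 0.7
var_u, cov_uv = monte_carlo_moments(n=100, reps=4000, u=u, v=v, rng=rng)
print(var_u, u * (1 - u))   # compare with the Brownian bridge variance u(1 - u)
print(cov_uv, u * (1 - v))  # compare with the covariance u(1 - v) for u <= v
```

The simulation checks only second moments at two points; it says nothing about the weak convergence of the whole process, which is what the theorem asserts.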

The asymptotic distribution of $\beta_n$ is derived from a result that
can be regarded as an extension of the Bahadur representation as
given in Theorem 2.2.3. This result was first established by Kiefer
(1970) and then extended to the general case by Csorgo and Revesz
(1981). We list here this latter result without proof.

Theorem 2.3.2: Let $X(1), \ldots, X(n)$ be IID RVs with a common DF $F$
which is twice differentiable on $(a,b)$ where

$$-\infty \le a = \sup\{x : F(x) = 0\} \quad \text{and} \quad +\infty \ge b = \inf\{x : F(x) = 1\}$$

and

$$F' = f > 0 \ \text{ on } (a,b) .$$

Assume that $F$ also satisfies

$$\sup_{a < x < b} F(x)(1 - F(x)) |f'(x)| / f^2(x) \le \gamma \ \text{ for some } \gamma > 0 , \tag{2.3.3}$$

and

$f$ is nondecreasing (nonincreasing) on an interval
to the right of $a$ (to the left of $b$). (2.3.4)

Let

$$R_n = \sup_{0 \le u \le 1} |(F_n(Q(u)) - u) - (Q(u) - Q_n(u)) fQ(u)| . \tag{2.3.5}$$

Then

$$\limsup_{n \to \infty} [n^{-3/4} (\log n)^{1/2} (\log\log n)^{1/4}]^{-1} R_n \stackrel{a.s.}{=} 2^{-1/4} . \tag{2.3.6}$$

Proof: See Csorgo and Revesz (1981) Theorems 4.5.6 p. 149,
5.2.1 and 5.2.2 p. 160.

Remark :  This  result  goes  much  beyond  our  needs  here.  We  will 
extract  from  it  the  following  two  results  about  quantile  processes 
(which  are  only  part  of  the  statement  of  the  theorem) . 

Theorem 2.3.3: Asymptotic Distribution of the Quantile Process.
Under the conditions of Theorem 2.3.2, we have

a) $$fQ(\cdot) \beta_n(\cdot) \xrightarrow{d} \beta(\cdot) , \tag{2.3.7}$$

where $\beta$ is distributed as a Brownian bridge process.

b) $$\beta \stackrel{a.s.}{=} -\alpha , \tag{2.3.8}$$

where $\stackrel{a.s.}{=}$ means that there exists $\Omega_1 \subset \Omega$, with $P(\Omega_1) = 1$ and such
that for any $\omega \in \Omega_1$, $\beta(u, \omega) = -\alpha(u, \omega)$, $0 \le u \le 1$.

Proof: Immediate from the convergence of the empirical process
(Theorem 2.3.1), the extended Bahadur representation (Theorem 2.3.2)
and Theorem 4.1 p. 25 of Billingsley (1968).

2.4  Empirical  and  Quantile  Processes  -  Dependent  Case 

In this section we assume that the RVs $X(1), \ldots, X(n)$ come from a
stochastic process (SP) $\{X\} = \{X(i) : i \in Z\}$, where $Z$ is the set of
integers. We first define the notions of stationarity and asymptotic
independence required, and then present some asymptotic results for
the empirical and quantile processes.

Definitions:

a) The SP $\{X(i) : i \in Z\}$ is said to be (strictly) stationary if for any
$k \ge 1$, $i_1 < i_2 < \ldots < i_k$ and any $j \ge 0$, the vectors
$(X(i_1), \ldots, X(i_k))$ and $(X(i_1 + j), \ldots, X(i_k + j))$ are
identically distributed.

b) For $I \subset Z$, we denote by $\mathcal{B}_I$ the $\sigma$-field generated by the RVs
$\{X(i) : i \in I\}$. We now introduce the notion of mixing. The SP
$\{X(i) : i \in Z\}$ is said to be $\phi$-mixing if there exists a positive
integer $M$ and a function $\phi$ for which $\phi(m) \to 0$ as $m \to \infty$ such that

$$|P(A \cap B) - P(A)P(B)| \le \phi(m) P(A) \tag{2.4.1}$$

whenever $m > M$, $n \in Z$, $A \in \mathcal{B}_I$ and $B \in \mathcal{B}_J$ where $I = \{i : i \le n\}$ and
$J = \{j : j \ge m + n\}$.

Note that when $P(A) > 0$ then 2.4.1 is equivalent to

$$|P(B|A) - P(B)| \le \phi(m) \tag{2.4.2}$$

while when $P(A) = 0$ then 2.4.1 is satisfied trivially. Hence
taking the left side of 2.4.2 as 0 when $P(A) = 0$ one can define
the mixing function $\phi(m)$ by

$$\phi(m) = \sup\{|P(B|A) - P(B)| : A \in \mathcal{B}_I , \ B \in \mathcal{B}_J , \ n \in Z\} \tag{2.4.3}$$

where $I$, $J$ and $n$ are as in 2.4.1. Then one places assumptions on
$\phi$ in order to obtain asymptotic results.

Remark: Other related notions of mixing have been introduced in the
literature. Two of them use the same l.h.s (left hand side) of 2.4.1
but put other bounds on the r.h.s (right hand side). One is the weak
$\phi$-mixing, where the r.h.s of 2.4.1 is replaced by $\phi(m)$ alone and the
other one is the strong $\phi$-mixing in which the r.h.s of 2.4.1 is
replaced by $\phi(m) P(A) P(B)$. Another measure of dependence is the one
introduced by Gastwirth and Rubin (1975). Their definitions and
results are particularly appropriate for time series which might not
be $\phi$-mixing. However, their conditions for the weak convergence of
the empirical process do not cover all the time series models we
would like to consider. We present here the results for processes
satisfying only $\phi$-mixing conditions. We do not discuss these
conditions further because our approach is only to illustrate the kinds of
conditions required to prove asymptotic normality in the dependent
case. These results are used in our statistical analysis procedures
only heuristically since we do not verify that the mixing conditions
assumed in this section are satisfied by the time series that we
observe.

The main result concerning the asymptotic behavior of the
empirical process for $\phi$-mixing RVs is given in Billingsley (1968).
This result has been shown to hold under weaker conditions by Sen
(1971). It has also been shown to hold under other definitions of
mixing structures. We will list here Billingsley's result and
mention Sen's improvement. This will fulfill the needs of our
heuristic approach. Before listing the theorem, we first define the
notion of dependence DF.

Definition: Let $(X_1, X_2)$ be a bivariate RV with DF $F(\cdot,\cdot)$ and marginal
DFs $F_1$, $F_2$ with corresponding QFs $Q_1$, $Q_2$. Then the dependence DF
of $(X_1, X_2)$ is defined by

$$B(u_1, u_2) = F(Q_1(u_1), Q_2(u_2)) , \quad 0 \le u_1, u_2 \le 1 . \tag{2.4.4}$$

Note that the dependence DF is, in fact, the DF of the bivariate
RV $(U_1, U_2)$ where $U_1 = F_1(X_1)$ and $U_2 = F_2(X_2)$.
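
For a stationary series, a sample analogue of the dependence DF of $(X(i), X(i+k))$ can be formed by replacing $U = F(X)$ with the rank transform $F_n(X(i))$ and counting lagged pairs. The following Python sketch illustrates that idea under stated assumptions and is not necessarily the estimator used later in this work; it assumes no ties and $k \ge 1$, and the names `rank_transform`, `empirical_dependence_df` and the data values are hypothetical.

```python
def rank_transform(xs):
    """U(i) = F_n(X(i)): the rank of X(i) in the sample divided by n (no ties assumed)."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])  # indices from smallest to largest
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / n
    return u


def empirical_dependence_df(xs, k):
    """Sample analogue of B_k(u1, u2), built from the pairs (U(i), U(i+k)), k >= 1."""
    u = rank_transform(xs)
    pairs = list(zip(u[:-k], u[k:]))

    def B(u1, u2):
        return sum(1 for p, q in pairs if p <= u1 and q <= u2) / len(pairs)

    return B


series = [0.3, 1.2, 0.7, 2.5, 1.9, 0.1]  # hypothetical observations
B1 = empirical_dependence_df(series, k=1)
print(rank_transform(series))
print(B1(0.5, 1.0), B1(0.5, 0.7), B1(1.0, 1.0))
```

Because only ranks enter, the result is unchanged by any strictly increasing transformation of the data, which mirrors the fact that the dependence DF does not depend on the marginal DFs.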

Theorem 2.4.1: (Billingsley (1968)) Let $\{X(i) : i \in Z\}$ be a
stationary $\phi$-mixing process with a continuous marginal DF $F$ and such
that

$$\sum_{n=1}^{\infty} n^2 [\phi(n)]^{1/2} < \infty . \tag{2.4.5}$$

Then

$$\alpha_n \xrightarrow{d} \alpha_F \tag{2.4.6}$$

where $\alpha_n$ is the empirical process corresponding to $\{X\}$ and $\alpha_F$ is the
Gaussian random function specified by

$$E\{\alpha_F(u)\} = 0 , \quad 0 \le u \le 1 , \tag{2.4.7}$$

$$E\{\alpha_F(u) \alpha_F(v)\} = K_B(u,v) , \tag{2.4.8}$$

where $K_B(u,v)$ is the dependence distribution covariance kernel of the
process $\{X(i) : i \in Z\}$, defined by

$$K_B(u,v) = u \wedge v - uv + 2 \sum_{k=1}^{\infty} [B_k(u,v) - uv] , \quad 0 \le u, v \le 1 , \tag{2.4.9}$$

in terms of $B_k$, the dependence DF of $(X(i), X(i+k))$.

Proof: See Billingsley (1968) Theorem 22.1 p. 197.

Remark: Sen (1971) showed that the same result holds under the
weaker condition that

$$\sum_{n=1}^{\infty} n [\phi(n)]^{1/2} < \infty . \tag{2.4.10}$$

The  asymptotic  normality  of  the  quantile  process  is  derived  in 
the  same  manner  as  in  the  independent  case  using  a  Bahadur 
representation  type  result  for  mixing  processes.  A  result  of  that 
type  is  given  in  Babu  and  Singh  (1978). 

Theorem 2.4.2: Bahadur Representation for Mixing Processes. Let
$\{X(i) : i \in Z\}$ be a stationary $\phi$-mixing process with
$\sum_{k=1}^{\infty} [\phi(k)]^{1/2} < \infty$.
Define the following conditions for the underlying DF $F$ and its
density $f$.

Condition 1: For some interval $I$, $f'$ exists and is bounded on $I$, $f$
vanishes outside $I$, $\inf\{f(x) : x \in I\} > 0$ and $\sup\{f(x) : x \in I\} < \infty$.

Condition 2: For some $0 \le a \le b \le 1$ and $\epsilon > 0$, condition 1 is satisfied for
the interval $I = [Q(a) - \epsilon, Q(b) + \epsilon]$, except that $f$ need not vanish
outside that interval.

Let

$$c_n = n^{-3/4} [\log(n)]^{1/2} [\log\log(n)]^{1/4} ,$$

$$R_n(u) = [F_n(Q(u)) - u] - fQ(u) [Q(u) - Q_n(u)] ,$$

$$R_n^* = \sup_{0 \le u \le 1} |R_n(u)| .$$

Then, under condition 1,

$$\limsup_{n \to \infty} c_n^{-1} R_n^* \le C_1 \quad \text{a.s.} , \tag{2.4.11}$$

and, under condition 2,

$$\limsup_{n \to \infty} c_n^{-1} \sup_{a \le u \le b} |R_n(u)| \le C_2 \quad \text{a.s.} , \tag{2.4.12}$$

where $C_1$ and $C_2$ are positive constants.

Proof: See Babu and Singh (1978) Theorem 7 and Remark 4.2.

The  asymptotic  normality  is  now  given  by  the  following  theorem. 

Theorem 2.4.3: Asymptotic Normality of the Quantile Process. Let
$\{X(i) : i \in Z\}$ be a stationary $\phi$-mixing process with
$\sum_{k} k^2 [\phi(k)]^{1/2} < \infty$. Let
$\beta_n$ be the quantile process formed from a realization of length $n$ from
$\{X\}$ and let $\beta_F = -\alpha_F$ where $\alpha_F$ is the Gaussian process defined in
Theorem 2.4.1. Then, under condition 1 of Theorem 2.4.2, we have

$$\{fQ(u) \beta_n(u) : 0 < u < 1\} \xrightarrow{d} \{\beta_F(u) : 0 < u < 1\} , \tag{2.4.13}$$

while under condition 2 of Theorem 2.4.2, we have

$$\{fQ(u) \beta_n(u) : a < u < b\} \xrightarrow{d} \{\beta_F(u) : a < u < b\} . \tag{2.4.14}$$

Proof: Immediate from the convergence of the empirical process
(Theorem 2.4.1), the Bahadur representation (Theorem 2.4.2) and
Theorem 4.1 p. 25 of Billingsley (1968).

2.5  Spectral  Density  Interpretation  of  KB(u,v) 

Let $\{X\}$ be a stationary time series with absolutely summable
correlation function $\rho_X(k)$ defined by

$$\rho_X(k) = \mathrm{Corr}[X(i), X(i+k)] , \quad i, k \in Z . \tag{2.5.1}$$

Let $f(\omega; X)$ be the corresponding spectral density function defined by

$$f(\omega; X) = 1 + 2 \sum_{k=1}^{\infty} \rho_X(k) \cos(2\pi k \omega) , \quad 0 \le \omega \le 1 . \tag{2.5.2}$$

Let $\bar{X}_n$ be the mean of a sample $X(1), \ldots, X(n)$ from $\{X\}$. Assume
that the time series $\{X\}$ obeys the conditions for $\bar{X}_n$ to be
asymptotically normal. Then

$$n^{1/2} [\bar{X}_n - \mu_X] \xrightarrow{d} N(0, \sigma^2) \tag{2.5.3}$$

where $\mu_X$ is the common mean, $\mu_X = E[X(i)]$, and the asymptotic variance
$\sigma^2$ can be expressed in terms of the value at zero frequency of the
spectral density function $f(\omega; X)$:

$$\sigma^2 = \mathrm{Var}[X(1)] f(0; X) . \tag{2.5.4}$$

The formula for $\sigma^2$ can be derived directly by calculating the
variance of $\bar{X}_n$ and taking the appropriate limit. We would like to
present here a method for deriving results like the above which will
be used later to derive the asymptotic variance of linear rank
statistics.
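
The direct calculation mentioned above rests on the standard identity $n \mathrm{Var}[\bar{X}_n] = \mathrm{Var}[X(1)] \{1 + 2 \sum_{k=1}^{n-1} (1 - k/n) \rho_X(k)\}$, whose limit is (2.5.4). The Python sketch below evaluates both sides for an assumed correlation function $\rho_X(k) = a^k$ (the form arising from a first-order autoregression), for which $f(0; X) = (1+a)/(1-a)$; the value $a = 0.5$, the truncation lag and the function names are hypothetical choices.

```python
import math


def spectral_density(rho, omega, max_lag):
    """Truncated version of (2.5.2): 1 + 2 * sum_{k=1}^{max_lag} rho(k) cos(2 pi k omega)."""
    return 1.0 + 2.0 * sum(rho(k) * math.cos(2.0 * math.pi * k * omega)
                           for k in range(1, max_lag + 1))


def n_var_of_mean(rho, var_x, n):
    """Exact n * Var(sample mean) for a stationary series with correlations rho(k)."""
    return var_x * (1.0 + 2.0 * sum((1.0 - k / n) * rho(k) for k in range(1, n)))


a = 0.5  # assumed correlation decay rate, for illustration only


def rho(k):
    return a ** k


f0 = spectral_density(rho, 0.0, max_lag=200)     # close to (1 + a) / (1 - a) = 3
f_half = spectral_density(rho, 0.5, max_lag=200)  # close to (1 - a) / (1 + a) = 1/3
finite_n = n_var_of_mean(rho, var_x=1.0, n=2000)  # approaches Var[X(1)] * f(0; X)
print(f0, f_half, finite_n)
```

With positive correlations $f(0; X) > 1$, so the variance of the sample mean is larger than it would be for independent data with the same marginal variance.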

Let $g$ be a function such that the time series $\{g(X)\}$ possesses a
spectral density function, and assume that the time series $\{X\}$
satisfies the conditions for the asymptotic normality of the
empirical process $\alpha_n$. Define

$$T_n(g) = (1/n) \sum_{i=1}^{n} g(X(i)) = \int_{-\infty}^{\infty} g(x) \, dF_n(x) = \int_0^1 g(Q(u)) \, dF_n(Q(u)) , \tag{2.5.5}$$

$$A_n(g) = n^{1/2} \left[ T_n(g) - \int_0^1 g(Q(u)) \, du \right] = \int_0^1 g(Q(u)) \, d\alpha_n(u) . \tag{2.5.6}$$

Then

$$A_n(g) \xrightarrow{d} A(g) = \int_0^1 g(Q(u)) \, d\alpha_F(u) \tag{2.5.7}$$

and $A(g)$ is a $N(0, \sigma_g^2)$ RV whose variance can be expressed as

$$\sigma_g^2 = \mathrm{Var}[g(X)] f(0; g(X)) \tag{2.5.8}$$

where $f(\cdot; g(X))$ is the spectral density of the time series $\{g(X)\}$.
We derive 2.5.8 by a method to be used below for linear rank
statistics:

$$\sigma_g^2 = \int_0^1 \int_0^1 g(Q(u)) g(Q(v)) \, dK_B(u,v)$$

$$= \int_0^1 [g(Q(u))]^2 \, du - \left\{ \int_0^1 g(Q(u)) \, du \right\}^2 + 2 \sum_{k=1}^{\infty} \int_0^1 \int_0^1 g(Q(u)) g(Q(v)) [b_k(u,v) - 1] \, du \, dv$$

$$= \mathrm{Var}[g(X)] \left\{ 1 + 2 \sum_{k=1}^{\infty} \mathrm{Corr}[g(X(1)), g(X(1+k))] \right\}$$

$$= \mathrm{Var}[g(X)] f(0; g(X)) \tag{2.5.9}$$

where $b_k$ is the dependence density function corresponding to the
dependence DF $B_k$. Note that taking $g(x) = x$ in 2.5.9 gives the result
2.5.4 for $\bar{X}_n$ as a special case.

CHAPTER III

ASYMPTOTIC  THEORY  FOR  TWO-SAMPLE  LINEAR  RANK  STATISTICS 

3.1  General  Settings  and  Assumptions 

We consider two samples $X(1), \ldots, X(m)$ and $Y(1), \ldots, Y(n)$
respectively representing finite realizations from two strictly
stationary stochastic processes $\{X(i) : i \in Z\}$ and $\{Y(j) : j \in Z\}$, where
$Z = \{0, \pm 1, \pm 2, \ldots\}$ is the set of all integers.

Assumption 1: The two processes are independent.

Assumption 2: The univariate marginal DFs of the two processes are
respectively given by

$$F(x) = \Pr[X(i) \le x] , \quad i \in Z , \ x \in \mathbb{R}$$

and (3.1.1)

$$G(y) = \Pr[Y(j) \le y] , \quad j \in Z , \ y \in \mathbb{R} .$$

We assume that $F$ and $G$ are continuous and denote their corresponding
QFs by $Q_F$ and $Q_G$ respectively.

Remark: One often desires to test the null hypothesis
$H_0 : F(x) = G(x)$ for all $x$.

Assumption 3: We further assume that each of the processes $\{X\}$ and
$\{Y\}$ satisfies the conditions on the dependence structure and on the
smoothness of the DFs, required for the convergence of the empirical
and quantile processes to the appropriate Gaussian processes, as
described in the preceding chapter.


To compare the univariate marginal distributions of the two time
series, we will follow the development as in Parzen (1983) adjusting
it to the case of time series. Additional assumptions will be made
as needed along the way.

3.2  Definitions  and  Notations 

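As above, we let $X(1),\ldots,X(m)$ and $Y(1),\ldots,Y(n)$ be the observed samples and $F$, $G$, $Q_F$ and $Q_G$ be the marginal DFs and QFs respectively.

We define the sample functions $\tilde F$, $\tilde G$, $\tilde Q_F$ and $\tilde Q_G$ by

$$\tilde F(x) = (1/m)\sum_{j=1}^{m} I_{(-\infty,x]}(X(j)), \quad -\infty<x<\infty, \tag{3.2.1}$$

$$\tilde Q_F(u) = \inf\{x: \tilde F(x)\ge u\}, \quad 0\le u\le 1, \tag{3.2.2}$$

$$\tilde G(x) = (1/n)\sum_{i=1}^{n} I_{(-\infty,x]}(Y(i)), \quad -\infty<x<\infty, \tag{3.2.3}$$

$$\tilde Q_G(u) = \inf\{x: \tilde G(x)\ge u\}, \quad 0\le u\le 1. \tag{3.2.4}$$

Next we let $N=m+n$, $\lambda_N=m/N$ and define

$$\tilde H(x) = \lambda_N\tilde F(x) + (1-\lambda_N)\tilde G(x), \quad -\infty<x<\infty, \tag{3.2.5}$$

to be the EDF of the pooled sample $X(1),\ldots,X(m),Y(1),\ldots,Y(n)$. We assume that $\lambda_N\to\lambda$ as $N\to\infty$, with $0<\lambda<1$, and define

$$H(x) = H_\lambda(x) = \lambda F(x) + (1-\lambda)G(x), \quad -\infty<x<\infty. \tag{3.2.6}$$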

In general, we define the inverse of a nondecreasing, right- or left-continuous function $D(t)$ by

$$D^{-1}(u) = \inf\{t: D(t)\ge u\} \tag{3.2.7}$$

if $D$ is right-continuous, and by

$$D^{-1}(u) = \sup\{t: D(t)\le u\} \tag{3.2.8}$$

if $D$ is left-continuous.

Next we define the basic comparison functions

$$\tilde D_1(u) = \tilde H\tilde Q_F(u) = \tilde H[\tilde F^{-1}(u)], \tag{3.2.9}$$

as estimator of $D_1(u)=HQ_F(u)$, and

$$\tilde D(u) = \tilde D_1^{-1}(u), \tag{3.2.10}$$

as estimator of $D(u)=D_1^{-1}(u)=FQ_H(u)$. Observe that $\tilde D_1$ is left-continuous, so that $\tilde D$ is right-continuous. Observe also that the Pyke-Shorack (1968) sample function, $\tilde F\tilde Q_H(u)$, which is another estimator of $D(u)$, is not, in general, equal to $\tilde D(u)$. This can be seen from the following explicit formulas for $\tilde D(u)$ and $\tilde F\tilde Q_H(u)$ given in Parzen (1983):

$$\tilde D(u) = \begin{cases} 0, & 0\le u< R_1/N, \\ j/m, & R_j/N\le u< R_{j+1}/N,\ \ j=1,\ldots,m-1, \\ 1, & R_m/N\le u\le 1; \end{cases} \tag{3.2.11}$$

$$\tilde F\tilde Q_H(u) = \begin{cases} 0, & 0\le u\le (R_1-1)/N, \\ j/m, & (R_j-1)/N< u\le (R_{j+1}-1)/N,\ \ j=1,\ldots,m-1, \\ 1, & (R_m-1)/N< u\le 1, \end{cases} \tag{3.2.12}$$

where $R_j$ or $R(j)$, for $j=1,\ldots,m$, is defined to be the rank in the pooled sample of $X(j{:}m)$, the $j$th order statistic in the X-sample. More precisely,


is of particular importance to the development of the asymptotic distribution of $T_N(J)$ since it exhibits $T_N(J)$ as a linear functional of the process $\tilde D(u)$. One expects that the asymptotic distribution of $T_N(J)$ is that of the same linear functional of the process which is the limit of $\tilde D(u)$ (appropriately normalized).

Before turning to the development of the asymptotic distribution of $\tilde D(u)$, we define four more functions that make the presentation of the results more symmetric and easier to follow. Those functions are

$$D_F(u) = FQ_H(u), \qquad d_F(u) = D_F'(u),$$

$$D_G(u) = GQ_H(u), \qquad d_G(u) = D_G'(u).$$

Note that $D_F(u)=D(u)$ and $D_G(u)=(1/(1-\lambda))(u-\lambda D(u))$. The introduction of the new functions is not strictly necessary and is used only for convenience.
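The difference between the two sample comparison functions in (3.2.11) and (3.2.12) can be checked numerically. The following is a minimal sketch, assuming continuous data (no ties) and NumPy; the function names `pooled_ranks`, `d_tilde` and `pyke_shorack` are illustrative and do not come from the report.

```python
import numpy as np

def pooled_ranks(x, y):
    """Ranks R_1 < ... < R_m of the ordered X-sample within the pooled sample."""
    xs = np.sort(np.asarray(x, dtype=float))
    ys = np.sort(np.asarray(y, dtype=float))
    # R_j = j + #{i : Y(i) <= X(j:m)}; ties are assumed absent.
    return np.arange(1, xs.size + 1) + np.searchsorted(ys, xs, side="right")

def d_tilde(u, ranks, n_total):
    """Step function of (3.2.11): equals j/m on [R_j/N, R_{j+1}/N)."""
    # side="right" counts the jump points R_j/N that are <= u.
    return np.searchsorted(ranks / n_total, u, side="right") / ranks.size

def pyke_shorack(u, ranks, n_total):
    """Step function of (3.2.12): equals j/m on ((R_j-1)/N, (R_{j+1}-1)/N]."""
    # side="left" counts the jump points (R_j-1)/N that are < u.
    return np.searchsorted((ranks - 1) / n_total, u, side="left") / ranks.size
```

For example, with `x = [1, 4]` and `y = [2, 3, 5]` the ranks are 1 and 4, and at `u = 0.1` the first function is 0 while the second is 1/2, so the two estimators differ by one jump of size 1/m.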

3.3  Asymptotic  Distribution  of  D(u) 

In this section we develop the asymptotic distribution of $\tilde D(u)$ for the case described in Section 3.1, i.e., when the two samples come from two independent time series satisfying the conditions for the weak convergence of the corresponding empirical processes. Let $\tilde\alpha_F$ and $\tilde\alpha_G$ be the empirical processes corresponding to $\{X\}$ and $\{Y\}$ respectively, i.e.,

$$\tilde\alpha_F(u) = m^{1/2}[\tilde FQ_F(u)-u], \quad 0\le u\le 1, \tag{3.3.1}$$

and

$$\tilde\alpha_G(u) = n^{1/2}[\tilde GQ_G(u)-u], \quad 0\le u\le 1. \tag{3.3.2}$$

Then we assume that

$$\tilde\alpha_F \xrightarrow{L} \alpha_F \tag{3.3.3}$$

and

$$\tilde\alpha_G \xrightarrow{L} \alpha_G, \tag{3.3.4}$$

where $\alpha_F$ and $\alpha_G$ are zero mean Gaussian processes with covariance kernels given, respectively, by

$$K_{\alpha_F}(u,v) = \mathrm{Cov}\{\alpha_F(u),\alpha_F(v)\} = K_0(u,v) + K_F(u,v), \tag{3.3.5}$$

$$K_{\alpha_G}(u,v) = \mathrm{Cov}\{\alpha_G(u),\alpha_G(v)\} = K_0(u,v) + K_G(u,v), \tag{3.3.6}$$

where

$$K_0(u,v) = u\wedge v - uv, \tag{3.3.7}$$

$$K_F(u,v) = 2\sum_{k=1}^{\infty}[B_k(u,v;F)-uv], \tag{3.3.8}$$

$$K_G(u,v) = 2\sum_{k=1}^{\infty}[B_k(u,v;G)-uv], \tag{3.3.9}$$

and $B_k(\cdot,\cdot;F)$ and $B_k(\cdot,\cdot;G)$ are the dependence DFs of lag $k$ for the processes $\{X\}$ and $\{Y\}$ respectively.

Next define the comparison processes $\tilde\gamma$ and $\tilde\delta$ by

$$\tilde\gamma(u) = N^{1/2}[\tilde F\tilde Q_H(u)-D(u)], \quad 0\le u\le 1, \tag{3.3.10}$$

and

$$\tilde\delta(u) = N^{1/2}[\tilde D(u)-D(u)], \quad 0\le u\le 1. \tag{3.3.11}$$
Now, using the explicit representations for $\tilde F\tilde Q_H$ and $\tilde D$ as given in (3.2.11) and (3.2.12), it is easy to see that

$$\sup\{|\tilde\gamma(u)-\tilde\delta(u)|: 0\le u\le 1\} \le \lambda_N^{-1/2}m^{-1/2}. \tag{3.3.12}$$

Hence, the two comparison processes have the same limiting behavior. Therefore, asymptotic results about $\tilde\gamma$ can be immediately applied to $\tilde\delta$.

Pyke and Shorack (1968) have studied the asymptotic behavior of $\tilde\gamma$ for the independence case and Fears and Mehra (1974) have extended their results to include the case of mixing processes. Here we sketch the main steps in the development of these results. Using Lemma 3.1 of Pyke and Shorack (1968) we have

$$\tilde\gamma(u) = (1-\lambda_N)\{\lambda_N^{-1/2}\tilde d_G(u)\tilde\alpha_F(F\tilde Q_H(u)) - (1-\lambda_N)^{-1/2}\tilde d_F(u)\tilde\alpha_G(G\tilde Q_H(u))\} + \tilde R(u), \tag{3.3.13}$$

where

$$\tilde d_F(u) = [D_F(H\tilde Q_H(u))-D_F(u)]/[H\tilde Q_H(u)-u], \tag{3.3.14}$$

$$\tilde d_G(u) = [D_G(H\tilde Q_H(u))-D_G(u)]/[H\tilde Q_H(u)-u], \tag{3.3.15}$$

$$\tilde R(u) = \tilde d_F(u)N^{1/2}[\tilde H\tilde Q_H(u)-u]. \tag{3.3.16}$$

Note that only algebraic manipulations are involved in the derivation of the above representation for $\tilde\gamma$, so that it is valid under any dependence structures of the underlying processes $\{X\}$ and $\{Y\}$. Next observe that $\tilde d_F$ and $\tilde d_G$ are related by the equation

$$\lambda_N\tilde d_F(u) + (1-\lambda_N)\tilde d_G(u) = 1, \tag{3.3.17}$$

and since $D_F$ and $D_G$ are nondecreasing, $\tilde d_F$ and $\tilde d_G$ are nonnegative. Hence, we immediately have

$$|\tilde d_F(u)| \le \lambda_N^{-1}, \quad 0\le u\le 1, \tag{3.3.18}$$

and

$$|\tilde d_G(u)| \le (1-\lambda_N)^{-1}, \quad 0\le u\le 1. \tag{3.3.19}$$

Also, since

$$|\tilde H\tilde Q_H(u)-u| \le N^{-1}, \quad 0\le u\le 1, \tag{3.3.20}$$

we have

$$|\tilde R(u)| \le \lambda_N^{-1}N^{-1/2}, \quad 0\le u\le 1. \tag{3.3.21}$$

Next define

$$\gamma(u) = \gamma_\lambda(u) = (1-\lambda)\{\lambda^{-1/2}d_G(u)\alpha_F(D_F(u)) - (1-\lambda)^{-1/2}d_F(u)\alpha_G(D_G(u))\}, \quad 0\le u\le 1, \tag{3.3.22}$$

and observe that $\gamma$ is the natural limit of $\tilde\gamma$ whenever convergence occurs. Using triangle inequality techniques (for the metrics involved) and the bounds in (3.3.18)-(3.3.21) above, the proof of the convergence of $\tilde\gamma$ to $\gamma$ is essentially reduced to showing the appropriate convergences of $\tilde d_F(u)$ and $\tilde d_G(u)$ to $d_F(u)$ and $d_G(u)$ (resp.) and of $\tilde\alpha_F(F\tilde Q_H(u))$ and $\tilde\alpha_G(G\tilde Q_H(u))$ to $\alpha_F(D_F(u))$ and $\alpha_G(D_G(u))$ (resp.). Pyke and Shorack (1968) (for the independence case) and Fears and Mehra (1974) (for the dependence case) provide the detailed proofs of these convergences. Consequently, we have

$$\tilde\gamma \xrightarrow{L} \gamma, \tag{3.3.23}$$

which implies

$$\tilde\delta \xrightarrow{L} \delta. \tag{3.3.24}$$

Observe that $\delta$ is a zero mean Gaussian process with covariance kernel

$$K_\delta(u,v) = \mathrm{Cov}[\delta(u),\delta(v)] = (1-\lambda)^2\{\lambda^{-1}d_G(u)d_G(v)K_{\alpha_F}(D_F(u),D_F(v)) + (1-\lambda)^{-1}d_F(u)d_F(v)K_{\alpha_G}(D_G(u),D_G(v))\}. \tag{3.3.25}$$

3.4 The Asymptotic Distribution of $T_N(J)$

Recall that

$$T_N(J) = \int_0^1 J\Big(\frac{N}{N+1}u\Big)\,d\tilde D(u), \tag{3.4.1}$$

and observe that the assumed continuity of $J$ implies the asymptotic equivalence of $T_N(J)$ and

$$T_N^*(J) = \int_0^1 J(u)\,d\tilde D(u). \tag{3.4.2}$$

Next define

$$\tilde\Delta_N(J) = N^{1/2}\Big\{T_N^*(J) - \int_0^1 J(u)\,dD(u)\Big\} = \int_0^1 J(u)\,d\tilde\delta(u), \tag{3.4.3}$$

and observe that

$$\tilde\delta \xrightarrow{L} \delta \tag{3.4.4}$$

implies

$$\tilde\Delta_N(J) \xrightarrow{L} \Delta(J) = \int_0^1 J(u)\,d\delta(u). \tag{3.4.5}$$

Further, $\Delta(J)$ is $N(0,\sigma_J^2)$, where

$$\sigma_J^2 = \int_0^1\!\!\int_0^1 J(u)J(v)\,dK_\delta(u,v). \tag{3.4.6}$$

One can also consider the joint convergence of $(\tilde\Delta_N(J_1),\ldots,\tilde\Delta_N(J_p))$ to $(\Delta(J_1),\ldots,\Delta(J_p))$, where the latter is a zero mean $p$-variate normal vector with covariance matrix

$$\sigma_{rs} = \mathrm{Cov}(\Delta(J_r),\Delta(J_s)) = \int_0^1\!\!\int_0^1 J_r(u)J_s(v)\,dK_\delta(u,v). \tag{3.4.7}$$

A careful evaluation of the last integral leads to the expression

$$\sigma_{rs} = C_1(J_r,J_s) + C_3(J_r,J_s) - C_2(J_r,J_s) + C_4(J_r,J_s), \tag{3.4.8}$$

where the terms on the right-hand side are defined by (3.4.9)-(3.4.19):

$$C_1(J_r,J_s) = (1-\lambda)^2\int_0^1 J_r(u)J_s(u)[\lambda^{-1}d_G^2(u)d_F(u) + (1-\lambda)^{-1}d_F^2(u)d_G(u)]\,du, \tag{3.4.9}$$
$$C_3(J_r,J_s) = (1-\lambda)^2\int_0^1\!\!\int_0^1 J_r(u)J_s(v)\{\lambda^{-1}[A_1(u,v;G,G,F) + A_2(u,v;G,G,F)] + (1-\lambda)^{-1}[A_1(u,v;F,F,G) + A_2(u,v;F,F,G)]\}\,du\,dv \tag{3.4.10}$$

where

$$A_1(u,v;G,G,F) = d_G'(u)d_G'(v)D_F(u\wedge v), \tag{3.4.11}$$

$$A_2(u,v;G,G,F) = d_G'(u\vee v)d_G(u\wedge v)d_F(u\wedge v), \tag{3.4.12}$$

$$C_2(J_r,J_s) = (1-\lambda)^2[\lambda^{-1}A_3(J_r,G,F)A_3(J_s,G,F) + (1-\lambda)^{-1}A_3(J_r,F,G)A_3(J_s,F,G)] \tag{3.4.13}$$

where

$$A_3(J_r,G,F) = \int_0^1 J_r(u)\{d_G(u)D_F(u)\}'\,du, \tag{3.4.14}$$

$$C_4(J_r,J_s) = (1-\lambda)^2\int_0^1\!\!\int_0^1 J_r(u)J_s(v)\{\lambda^{-1}[A_4(u,v;G,F) + A_5(u,v;G,F) + A_6(u,v;G,F) + A_7(u,v;G,F)] + (1-\lambda)^{-1}[A_4(u,v;F,G) + A_5(u,v;F,G) + A_6(u,v;F,G) + A_7(u,v;F,G)]\}\,du\,dv \tag{3.4.15}$$

$$A_4(u,v;G,F) = d_G'(u)d_G'(v)K_F(D_F(u),D_F(v)), \tag{3.4.16}$$

$$A_5(u,v;G,F) = d_G'(u)d_G(v)d_F(v)K_F^{(2)}(D_F(u),D_F(v)), \tag{3.4.17}$$

$$A_6(u,v;G,F) = d_G(u)d_G'(v)d_F(u)K_F^{(1)}(D_F(u),D_F(v)), \tag{3.4.18}$$

$$A_7(u,v;G,F) = d_G(u)d_G(v)d_F(u)d_F(v)\cdot 2\sum_{k=1}^{\infty}[b_k(D_F(u),D_F(v);F)-1] \tag{3.4.19}$$

and $K^{(1)}$ and $K^{(2)}$ are the partial derivatives of $K$ w.r.t. the 1st and 2nd arguments, respectively. Recall that $b_k$ is the dependence density function corresponding to the dependence DF $B_k$. Clearly, expression (3.4.8) is not very easy to work with (although it is completely symmetric w.r.t. $F$ and $G$).

Under the null hypothesis, $H_0: F=G$, we have

$$D_F(u) = D_G(u) = u, \qquad d_F(u) = d_G(u) = 1, \qquad d_F'(u) = d_G'(u) = 0. \tag{3.4.20}$$

Consequently, the expression for $\sigma_{rs}$ reduces to
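$$\sigma_{rs} = \frac{1-\lambda}{\lambda}V(J_r,J_s) + 2(1-\lambda)^2\sum_{k=1}^{\infty}[\lambda^{-1}V_k(J_r,J_s;F) + (1-\lambda)^{-1}V_k(J_r,J_s;G)] \tag{3.4.21}$$

where

$$V(J_r,J_s) = \int_0^1 J_r(u)J_s(u)\,du - \int_0^1 J_r(u)\,du\int_0^1 J_s(u)\,du = \mathrm{Cov}[J_r(F(X(i))),J_s(F(X(i)))] = \mathrm{Cov}[J_r(G(Y(j))),J_s(G(Y(j)))], \tag{3.4.22}$$

$$V_k(J_r,J_s;F) = \int_0^1\!\!\int_0^1 J_r(u)J_s(v)[b_k(u,v;F)-1]\,du\,dv = \mathrm{Cov}[J_r(F(X(i))),J_s(F(X(i+k)))]. \tag{3.4.23}$$

Consequently, rearranging terms, we have

$$\sigma_{rs} = \frac{1-\lambda}{\lambda}\Big\{(1-\lambda)\Big[V(J_r,J_s) + 2\sum_{k=1}^{\infty}V_k(J_r,J_s;F)\Big] + \lambda\Big[V(J_r,J_s) + 2\sum_{k=1}^{\infty}V_k(J_r,J_s;G)\Big]\Big\}. \tag{3.4.24}$$

For the case $r=s$, this can be reduced even more. We then have

$$\sigma_J^2 = \frac{1-\lambda}{\lambda}\mathrm{Var}(J(U))\Big\{(1-\lambda)\Big[1 + 2\sum_{k=1}^{\infty}\rho(k;J,F)\Big] + \lambda\Big[1 + 2\sum_{k=1}^{\infty}\rho(k;J,G)\Big]\Big\} = \frac{1-\lambda}{\lambda}\mathrm{Var}(J(U))[(1-\lambda)f(0;J,F) + \lambda f(0;J,G)], \tag{3.4.25}$$

where $\mathrm{Var}(J(U)) = \int_0^1 J^2(u)\,du - [\int_0^1 J(u)\,du]^2$,

$\rho(\cdot;J,F)$ is the correlation function of the time series $\{J(F(X(i))): i\in Z\}$,

$f(\cdot;J,F)$ is the spectral density function that corresponds to $\rho(\cdot;J,F)$, i.e.,

$$f(\omega;J,F) = 1 + 2\sum_{k=1}^{\infty}\rho(k;J,F)\cos(2\pi\omega k), \quad 0\le\omega\le 1, \tag{3.4.26}$$

and $\rho(\cdot;J,G)$, $f(\cdot;J,G)$ are the corresponding quantities for the time series $\{J(G(Y(j))): j\in Z\}$.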

Remarks:

1) The terms $C_1$, $C_2$ and $C_3$ in the general expression for $\sigma_{rs}$ are exactly the same under independence, while the last term, $C_4$, is contributed by the dependence structure of the two processes $\{X\}$ and $\{Y\}$. This term vanishes under independence since then $K_F = K_G = 0$.

2) Under $H_0$, $C_3$ vanishes, $C_1$ reduces to

$$\frac{1-\lambda}{\lambda}\int_0^1 J_r(u)J_s(u)\,du \tag{3.4.27}$$

and $C_2$ reduces to

$$\frac{1-\lambda}{\lambda}\int_0^1 J_r(u)\,du\int_0^1 J_s(u)\,du \tag{3.4.28}$$

so that $V(J_r,J_s)$ is the covariance under independence while the last term, $V_k(J_r,J_s;\cdot)$, is contributed by the dependence structures.

3) Note that $b_k(\cdot,\cdot;F)$ is the joint density of the RVs $F(X(i))$ and $F(X(i+k))$ (any $i\in Z$), and 1 is the product of the marginal densities of these two RVs and hence

$$\int_0^1\!\!\int_0^1 J_r(u)J_s(v)[b_k(u,v;F)-1]\,du\,dv = \mathrm{Cov}[J_r(F(X(i))),J_s(F(X(i+k)))]. \tag{3.4.29}$$
3.5 Interpretations

The formula for the asymptotic variance of $T_N(J)$, obtained in the last section, is of particular interest since it emphasizes and isolates the effect of dependence and, moreover, it expresses that effect in an interpretable and estimable form. The first thing to observe is that the term

$$\sigma_0^2(J) = \frac{1-\lambda}{\lambda}\mathrm{Var}[J(U)] \tag{3.5.1}$$

is the variance under independence, so that the multiplicative term on the right, i.e.,

$$\sigma_D^2(J) = [(1-\lambda)f(0;J,F) + \lambda f(0;J,G)] \tag{3.5.2}$$

is the effect of the dependence structures of $\{X\}$ and $\{Y\}$. Next observe that this term is a weighted average of the effects of each of the time series $\{X\}$ and $\{Y\}$, while these effects are expressed as the values at zero of the spectral density functions of the time series $\{J(F(X))\}$ and $\{J(G(Y))\}$. The fact that the dependence effect of each series is summarized in a single value (as opposed to being expressed as an infinite sum) makes it easier to analyze that effect both qualitatively and quantitatively. Observe that whether the dependence effect is greater than or smaller than 1 determines whether it increases or decreases the variance of the statistic relative to independence. Now, the value of a spectral density function at 0 has a special meaning. Since

$$f(0) = 1 + 2\sum_{k=1}^{\infty}\rho(k), \tag{3.5.3}$$

where $f$ is the spectral density function of a time series with correlation function $\rho(\cdot)$, $f(0)>1$ occurs when $\sum_{k=1}^{\infty}\rho(k) > 0$ and this last condition can be viewed as a kind of positive dependence in the time series. Similarly, $f(0)<1$ can be viewed as a kind of negative dependence. When $T_N(J)$ is used to test the hypothesis $H_0: F=G$ versus the general alternative $H_1: F\ne G$, a critical region of the form $|\tilde\Delta_N(J)| > c_N(\alpha)$ (with $D(u)=u$ in $\tilde\Delta_N(J)$) is usually used with $c_N(\alpha)\to\sigma_0(J)Q_\Phi(1-\alpha/2)$. If dependence is present and is not taken into account, it can affect the size of the test. If $\sigma_D^2(J)>1$ then the test will reject too often (i.e., actual $\alpha$ > nominal $\alpha$) while the opposite will occur when $\sigma_D^2(J)<1$.
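The size distortion described above can be quantified with a short calculation. The sketch below assumes that the statistic normalized under independence is asymptotically normal with variance $\sigma_D^2(J)$ rather than 1, and that a two-sided critical value at the $1-\alpha/2$ normal quantile is used; the function name `actual_size` is illustrative and not part of the report.

```python
import math
from statistics import NormalDist

def actual_size(nominal_alpha, sigma2_d):
    """Asymptotic rejection probability under H0 of a two-sided test that
    ignores a dependence effect sigma2_d = sigma_D^2(J)."""
    std_normal = NormalDist()
    # Critical value chosen as if the statistic were N(0, 1).
    z = std_normal.inv_cdf(1.0 - nominal_alpha / 2.0)
    # The statistic is actually N(0, sigma2_d), so rescale the critical value.
    return 2.0 * (1.0 - std_normal.cdf(z / math.sqrt(sigma2_d)))
```

With `sigma2_d = 1` the nominal level is recovered; with `sigma2_d = 4` a nominal .05 test rejects roughly a third of the time under the null hypothesis.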

Remark: The issue of positive dependence has been discussed in the literature. One of the common definitions of positive dependence is the following. The RVs $X$ and $Y$ are said to be positively dependent if

$$E[h(X)h(Y)] \ge 0 \ \text{ for all } h \text{ with } E|h(X)h(Y)| < \infty. \tag{3.5.4}$$

Gleser and Moore (1983) applied this condition to any pair $(X(i),X(j))$ of a stationary time series and showed that tests of fit may reject the null hypothesis of fit too often under dependence when independence is assumed. This is shown to hold for a wide range of tests including chi-squared type tests and tests based on the EDF. We want to mention here that their condition is much stronger than the condition $f(0;X)>1$. First, the condition $E[h(X(1))h(X(1+k))]\ge 0$ for any function $h$ is stronger than $\rho_X(k)\ge 0$ (although the two conditions coincide for Gaussian processes) and secondly, $\rho_X(k)\ge 0$ for all $k$ is much stronger than $\sum_k\rho_X(k)\ge 0$. Moreover, their conclusion, being completely qualitative, is weaker than the one given here, which expresses the effect of the dependence explicitly and even suggests a way to estimate it. We should mention, however, that the result here is applicable only for one specific class of statistics, namely, the linear rank statistics for the two-sample problem, while their results and methods are applicable for a much wider range of statistics.

3.6  Estimation 

In  the  previous  section  we  have  discussed  the  effect  of 
dependence  on  various  test  statistics  when  the  tests  are  formed 
assuming  independence.  Here  we  are  going  to  discuss  a  solution  to 
the  problem,  i.e,  what  can  be  done  when  one  is  willing  to  take  into 
account  the  dependence  structure.  The  first  thing  to  observe  is  that 
the  asymptotic  normality  of  the  test  statistic  continues  to  hold 
(under  the  appropriate  conditions  of  smoothness  and  mixing),  so  that 
the  only  other  thing  needed  in  order  to  determine  a  critical  region 
is  an  estimable  expression  for  the  asymptotic  variance  of  the  test 
statistic.  The  formula  for  the  variance  discussed  here  is  almost 
directly  estimable.  Since  procedures  for  spectral  density  estimation 
are  easily  available,  the  only  problem  is  that  the  time  series 
$\{J(F(X(j))): j\in Z\}$ and $\{J(G(Y(i))): i\in Z\}$ are not observable since the functions $F$ and $G$ are unknown (note that the function $J$ is completely specified for any specific statistic at hand). However, we have a natural estimator for a DF, namely, the sample DF, so we can take the observed time series $\{\tilde F(X(j)): j=1,\ldots,m\}$ and $\{\tilde G(Y(i)): i=1,\ldots,n\}$ as estimators of finite realizations from the time series $\{F(X(j)): j\in Z\}$ and $\{G(Y(i)): i\in Z\}$ respectively. Note that by the Glivenko-Cantelli Lemma,

$$\max\{|\tilde F(X(j))-F(X(j))|: j=1,\ldots,m\} \le \sup\{|\tilde F(x)-F(x)|: x\in\mathbb{R}\} \xrightarrow{a.s.} 0 \tag{3.6.1}$$

(with a similar result for $Y$) so that the above estimating samples do make sense. (Note that although the Glivenko-Cantelli Lemma was originally proved for independent RVs, it is easy to show that it also holds for $\phi$-mixing, stationary time series with $\sum_{k=1}^{\infty}\phi(k)<\infty$.)

Remark: If we denote by $R_X(j)$ the rank of $X(j)$ within the X-sample and, similarly, by $R_Y(i)$ the rank of $Y(i)$ within the Y-sample, then we have

$$\tilde F(X(j)) = R_X(j)/m \quad\text{and}\quad \tilde G(Y(i)) = R_Y(i)/n. \tag{3.6.2}$$

Consequently, we call the time series $\{\tilde F(X(j))\}$ and $\{\tilde G(Y(i))\}$ the rank transform of the time series $\{X(j)\}$ and $\{Y(i)\}$ respectively. These rank transform time series appear to be very interesting in their own right. In particular, the relationships between the original time series and its rank transform version are not yet known. We will try to explore some of those relationships in the next chapter.
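As an illustration of (3.6.2), the rank transform of an observed series can be computed as follows. This is a minimal sketch assuming NumPy and a sample without ties; the function name `rank_transform` is illustrative.

```python
import numpy as np

def rank_transform(x):
    """Return the sample DF evaluated at each observation, i.e. R_X(j)/m,
    keeping the original time order of the series."""
    x = np.asarray(x, dtype=float)
    ranks = np.empty(x.size, dtype=int)
    # argsort gives the time indices in increasing order of value;
    # the observation at the k-th such index receives rank k+1.
    ranks[np.argsort(x, kind="stable")] = np.arange(1, x.size + 1)
    return ranks / x.size
```

Because only the ordering of the observations enters, the result is unchanged by any strictly increasing transformation of the data, for example `rank_transform(np.exp(x))` equals `rank_transform(x)`.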
3.7 Applications

Suppose that the processes $\{X(i): i\in Z\}$ and $\{Y(j): j\in Z\}$ are Gaussian time series that satisfy the conditions for the convergence of $\tilde\delta$ to $\delta$. Let $\rho_X$ and $\rho_Y$ be the correlation functions of $\{X\}$ and $\{Y\}$ respectively. Then the dependence densities $b_k(\cdot,\cdot;F)$ and $b_k(\cdot,\cdot;G)$ have the same general form given by

$$b_k(u,v;F) = (1-\rho_X^2(k))^{-1/2}\exp\{-.5(1-\rho_X^2(k))^{-1}[\rho_X^2(k)Q_\Phi^2(u) + \rho_X^2(k)Q_\Phi^2(v) - 2\rho_X(k)Q_\Phi(u)Q_\Phi(v)]\} \tag{3.7.1}$$

where $Q_\Phi$ is the quantile function of a $N(0,1)$ RV.

We will now try to evaluate the asymptotic variance of some common linear rank statistics.

a) The Wilcoxon Statistic, W.

For the Wilcoxon statistic we have

$$J_W(u) = u. \tag{3.7.2}$$

Hence,

$$\mathrm{Var}[J_W(U)] = 1/12. \tag{3.7.3}$$

Next we need to evaluate $\rho(k;J_W,\Phi)$. We have

$$\rho(k;J_W,\Phi) = 12\,\mathrm{Cov}[\Phi(Z_1),\Phi(Z_2)] = 12E[\Phi(Z_1)\Phi(Z_2)] - 3, \tag{3.7.4}$$

where $Z_1$ and $Z_2$ are $N(0,1)$ RVs with correlation coefficient $\rho_X(k)$ (for $\{X\}$) or $\rho_Y(k)$ (for $\{Y\}$). In what follows we are using results given in Owen (1980) to evaluate integrals for the normal distribution. For simplicity, we will denote the correlation between $Z_1$ and $Z_2$ simply by $\rho$. We have

$$E[\Phi(Z_1)\Phi(Z_2)] = E\{\Phi(Z_1)E[\Phi(Z_2)\mid Z_1]\}. \tag{3.7.5}$$

Now, $Z_2\mid Z_1=z$ is a $N(\rho z,1-\rho^2)$ RV. Hence (using eq. 10,100.8 p. 403 Owen (1980)),

$$E[\Phi(Z_2)\mid Z_1=z] = \int\Phi(t)\,d\Phi[(t-\rho z)/(1-\rho^2)^{1/2}] = \int\Phi[\rho z+(1-\rho^2)^{1/2}t]\,d\Phi(t) = \Phi[\rho z/(2-\rho^2)^{1/2}]. \tag{3.7.6}$$

So (using eqs. 2,010.6 and 2,010.7 p. 400 Owen (1980)),

$$E[\Phi(Z_1)\Phi(Z_2)] = \int\Phi(t)\Phi[\rho t/(2-\rho^2)^{1/2}]\,d\Phi(t) = 1/4 + (1/2\pi)\tan^{-1}[\rho/(4-\rho^2)^{1/2}] = 1/4 + (1/2\pi)\sin^{-1}(\rho/2). \tag{3.7.7}$$

Hence

$$\rho(k;J_W,\Phi) = (6/\pi)\sin^{-1}(\rho/2), \tag{3.7.8}$$

where, again, $\rho$ represents either $\rho_X(k)$ or $\rho_Y(k)$. Consequently, we have

$$\mathrm{Var}[\Delta(J_W)] = \frac{1-\lambda}{\lambda}(1/12)\Big\{(1-\lambda)\Big[1+(12/\pi)\sum_{k=1}^{\infty}\sin^{-1}(\rho_X(k)/2)\Big] + \lambda\Big[1+(12/\pi)\sum_{k=1}^{\infty}\sin^{-1}(\rho_Y(k)/2)\Big]\Big\}. \tag{3.7.9}$$

b) The Normal Score Test, N.

In this case $J_N(u) = Q_\Phi(u)$. Hence

$$J_N(F(X(i))) = X(i) \quad\text{and}\quad J_N(G(Y(j))) = Y(j). \tag{3.7.10}$$

So, we immediately have

$$\mathrm{Var}[\Delta(J_N)] = \frac{1-\lambda}{\lambda}[(1-\lambda)f(0;X) + \lambda f(0;Y)]. \tag{3.7.11}$$

c) The Median Test, M.

In this case $J_M(u) = \mathrm{sgn}(u-.5)$. Hence

$$\mathrm{Var}[J_M(U)] = 1, \tag{3.7.12}$$

and

$$
\begin{aligned}
\rho(k;J_M,F) &= \mathrm{Corr}[J_M(F(X(i))),J_M(F(X(i+k)))] \\
&= E\{\mathrm{sgn}[F(X(i))-.5]\,\mathrm{sgn}[F(X(i+k))-.5]\} \\
&= E[\mathrm{sgn}(X(i)-\mathrm{med}(X))\,\mathrm{sgn}(X(i+k)-\mathrm{med}(X))] \\
&= \Pr[(X(i)-\mathrm{med}(X))(X(i+k)-\mathrm{med}(X)) > 0] - \Pr[(X(i)-\mathrm{med}(X))(X(i+k)-\mathrm{med}(X)) < 0] \\
&= 2\Pr[(X(i)-\mathrm{med}(X))(X(i+k)-\mathrm{med}(X)) > 0] - 1 \\
&= 2\{\Pr[X(i)<\mathrm{med}(X),X(i+k)<\mathrm{med}(X)] + \Pr[X(i)>\mathrm{med}(X),X(i+k)>\mathrm{med}(X)]\} - 1
\end{aligned}
\tag{3.7.13}
$$

where $\mathrm{med}(X)=Q_F(.5)$ is the median of $X$. Note that this is a familiar measure of association; it is usually denoted by $q$ and is discussed in Blomqvist (1950). For jointly normal RVs, $\mathrm{med}(X)=E(X(i))$ and, also, the two probabilities in the last expression above are equal. Hence,

$$\rho(k;J_M,F) = 4\Pr(Z_1<0,Z_2<0) - 1, \tag{3.7.14}$$

where $(Z_1,Z_2)$ is a standard bivariate normal RV with correlation coefficient $\rho_X(k)$. Hence (using eq. 3.5 p. 416 and eq. 2.2 p. 414 Owen (1980)),

$$\rho(k;J_M,F) = 4\{1/2 - (1/\pi)\tan^{-1}[((1-\rho)/(1+\rho))^{1/2}]\} - 1 = 1 - (4/\pi)\tan^{-1}[((1-\rho)/(1+\rho))^{1/2}] = (2/\pi)\sin^{-1}(\rho). \tag{3.7.15}$$

Consequently,

$$\mathrm{Var}[\Delta(J_M)] = \frac{1-\lambda}{\lambda}\Big\{(1-\lambda)\Big[1 + (4/\pi)\sum_{k=1}^{\infty}\sin^{-1}(\rho_X(k))\Big] + \lambda\Big[1 + (4/\pi)\sum_{k=1}^{\infty}\sin^{-1}(\rho_Y(k))\Big]\Big\}. \tag{3.7.16}$$
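The three Gaussian-case results (3.7.8), (3.7.11) and (3.7.15) can be compared numerically by evaluating the corresponding value $f(0) = 1 + 2\sum_k\rho(k;J,\cdot)$ for a given autocorrelation sequence. The sketch below assumes a truncated sequence of lag correlations is supplied by the user; the AR(1) correlation $\rho(k)=\phi^k$ in the usage note is only an illustrative assumption, and the function name `gaussian_f0` is not from the report.

```python
import numpy as np

def gaussian_f0(rho):
    """Dependence effect f(0;J,.) of a Gaussian series with lag correlations
    rho = [rho(1), rho(2), ...] (truncated), for three score functions."""
    rho = np.asarray(rho, dtype=float)
    return {
        # rho(k;J_W) = (6/pi) arcsin(rho/2), so 2*sum gives the 12/pi factor.
        "wilcoxon": 1.0 + (12.0 / np.pi) * np.sum(np.arcsin(rho / 2.0)),
        # Normal scores leave a standardized Gaussian series unchanged.
        "normal": 1.0 + 2.0 * np.sum(rho),
        # rho(k;J_M) = (2/pi) arcsin(rho), so 2*sum gives the 4/pi factor.
        "median": 1.0 + (4.0 / np.pi) * np.sum(np.arcsin(rho)),
    }
```

For instance, `gaussian_f0(0.5 ** np.arange(1, 200))` gives a normal-score effect close to 3, with the Wilcoxon and median effects somewhat smaller, so in this positively correlated example all three tests would be oversized if independence were assumed.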

3.8  Empirical  Examples 

In  this  section  we  are  going  to  analyze  briefly  two  data  sets, 
applying  some  of  the  procedures  discussed  above.  The  first  data  set 
consists  of  two  samples  of  size  32  each.  It  represents  the 
reflectance  of  energy  from  a  mineral  in  32  disjoint  bands  of 
frequency.  Each  sample  represents  a  different  mineral  and  the 


question  is  whether  these  two  minerals  differ  in  the  way  they  reflect 
energy. Clearly, the two samples are independent, but the observations within each sample are probably not independent.

Remark :  There  is  always  the  general  problem  of  applying  asymptotic 
results  to  finite  samples.  Practically,  almost  any  sample  can  be 
exposed  to  the  computer  programs  that  carry  out  the  computations  of  a 
statistical  procedure.  The  question  is  how  to  interpret  the  results. 
Our  approach  is  to  use  the  statistical  procedures  to  learn  about  the 
data  and,  at  the  same  time,  to  use  the  data  to  learn  about  the 
behavior  of  the  statistical  procedure.  Consequently,  when  we  have  a 
small  sample,  we  use  the  statistical  procedure  as  an  exploratory  tool 
and  do  not  draw  definite  conclusions. 

The  analysis  of  the  second  data  set  demonstrates  an  application 
of  the  two-sample  procedure  in  a  time  series  situation  which  can  be 
regarded  as  a  check  for  stationarity  of  the  time  series.  To  test 
whether  the  marginal  distributions  of  the  components  of  a  time  series 
remain  unchanged  as  time  changes,  one  approach  is  to  take  the  two  end 
parts  of  the  time  series  and  expose  them  to  the  two-sample  procedure. 
Since  the  observations  within  each  sample  are  not  independent,  the 
effect  of  dependence  needs  to  be  taken  into  account.  Note  that  the 
statistical  procedure  described  here  is  not  strictly  adequate  for 
this  application  since  the  assumed  independence  of  the  two  samples  is 
not  completely  satisfied.  However,  when  the  two  samples  come  from 
far  apart  portions  of  the  time  series,  we  can  still  use  our 
exploratory  approach  to  gain  insight  into  the  problem.  The  remark 
above  about  applying  asymptotic  results  to  finite  samples  applies 


also  here. 


The  second  data  set  we  analyze  here  was  formed  from  the  well- 
known  International  Airlines  time  series.  Since  this  time  series  has 
a  clearly  recognized  trend  and  twelve  month  cycle  in  it  we  follow  the 
commonly  used  practice  and  transform  the  data  by  taking  the  12th 
difference  of  the  log  of  the  original  data.  The  resulting  time 
series  has  132  observations,  so  we  formed  the  two  samples  by  taking 
the  first  and  last  thirds.  This  gives  two  samples  of  size  44  each 
which  we  analyze  as  our  second  example  below. 

a)  Computational  Formulas 

Given two samples $X(1),\ldots,X(m)$ and $Y(1),\ldots,Y(n)$, we first order each of them and denote by $X(1{:}m),\ldots,X(m{:}m)$ and $Y(1{:}n),\ldots,Y(n{:}n)$ the resulting order statistics vectors. Then the ranks are calculated by

$$R_j = j + \sum_{i=1}^{n} I[Y(i{:}n)\le X(j{:}m)], \quad j=1,\ldots,m. \tag{3.8.1}$$

Next, the comparison DF $\tilde D(u)$ is represented by an $m+2$ ordinate vector $D_j$ and by an $m+2$ abscissa vector $U_j$ defined by

$$D_j = (j-1)/m, \ \ j=1,\ldots,m+1, \qquad D_{m+2} = 1, \qquad U_{j+1} = R_j/N, \ \ j=0,\ldots,m+1, \tag{3.8.2}$$

where $N=m+n$, $R_0=0$ and $R_{m+1}=N$. Then we plot $D_j$ versus $U_j$, joining the points by straight lines. Next we calculate three linear rank statistics of the form

$$T_N(J) = (1/m)\sum_{j=1}^{m} J(R_j/(N+1)), \tag{3.8.3}$$

where

$J(u) = J_W(u) = u$ for the Wilcoxon statistic,

$J(u) = J_N(u) = Q_\Phi(u)$ for the normal score statistic,

$J(u) = J_M(u) = \mathrm{sgn}(u-.5)$ for the median statistic.

Then we calculate the mean and variance under independence, and under the null hypothesis, for each of the statistics using the following formulas.

$$E[T_N(J)] = \frac{1}{N}\sum_{j=1}^{N} J\Big(\frac{j}{N+1}\Big), \tag{3.8.4}$$

$$V[T_N(J)] = \frac{n}{mN(N-1)}\sum_{j=1}^{N}\Big\{J\Big(\frac{j}{N+1}\Big)-E[T_N(J)]\Big\}^2. \tag{3.8.5}$$

Note that

$$E[T_N(J)] \to \mu(J) = \int_0^1 J(u)\,du \quad\text{and}\quad NV[T_N(J)] \to \sigma_0^2 = \frac{1-\lambda}{\lambda}\int_0^1 [J(u)-\mu(J)]^2\,du \quad\text{as } N\to\infty.$$

Next we normalize each statistic by

$$T_N^*(J) = \{T_N(J)-E[T_N(J)]\}/\{V[T_N(J)]\}^{1/2}. \tag{3.8.6}$$

Note that under independence (and under the null hypothesis) $T_N^*(J)\xrightarrow{L}N(0,1)$. Next we estimate the effect of the dependence structure within each sample on the asymptotic variance of $T_N^*(J)$. This is done by estimating the spectral density of the time series $\{J(\tilde F(X(j)))\}$ and $\{J(\tilde G(Y(i)))\}$ for each of the score functions $J_W$, $J_N$ and $J_M$. The formulas for the spectral estimation procedures are described in the next chapter. After having estimates of $f(0;J,F)$ and $f(0;J,G)$, we calculate

$$\tilde\sigma_D^2(J) = (1-\lambda)\tilde f(0;J,F) + \lambda\tilde f(0;J,G) \tag{3.8.7}$$

where $\lambda=m/N$. Finally, we let $T_{ND}(J)=T_N^*(J)/\tilde\sigma_D(J)$, and check whether $|T_{ND}(J)|>Q_\Phi(1-\alpha/2)$ in order to approximate a level $\alpha$ test for the null hypothesis.
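The computational steps (3.8.1) and (3.8.3)-(3.8.7) can be sketched in a few lines. This is an illustrative outline only: it assumes NumPy, data without ties, the finite-sample variance in the form written in (3.8.5) above, and that the spectral estimates `f0_x` and `f0_y` of $f(0;J,F)$ and $f(0;J,G)$ are supplied by the caller (their estimation is the subject of the next chapter). The names `SCORES` and `linear_rank_test` are hypothetical.

```python
import numpy as np
from statistics import NormalDist

SCORES = {
    "wilcoxon": lambda u: u,
    "normal": np.vectorize(NormalDist().inv_cdf),   # Q_Phi(u)
    "median": lambda u: np.sign(u - 0.5),
}

def linear_rank_test(x, y, score, f0_x=1.0, f0_y=1.0):
    """Return (T_N, T_N*, sigma_D^2, T_ND) for one score function."""
    xs = np.sort(np.asarray(x, dtype=float))
    ys = np.sort(np.asarray(y, dtype=float))
    m, n = xs.size, ys.size
    N = m + n
    J = SCORES[score]
    # (3.8.1): pooled ranks of the ordered X-sample.
    ranks = np.arange(1, m + 1) + np.searchsorted(ys, xs, side="right")
    # (3.8.3): the linear rank statistic.
    t_n = float(np.mean(J(ranks / (N + 1))))
    # (3.8.4)-(3.8.5): null mean and variance under independence.
    all_scores = J(np.arange(1, N + 1) / (N + 1))
    mean0 = float(np.mean(all_scores))
    var0 = n / (m * N * (N - 1)) * float(np.sum((all_scores - mean0) ** 2))
    # (3.8.6): normalization under independence.
    t_star = (t_n - mean0) / np.sqrt(var0)
    # (3.8.7): dependence effect, then the adjusted statistic.
    lam = m / N
    sigma2_d = (1 - lam) * f0_x + lam * f0_y
    return t_n, t_star, sigma2_d, t_star / np.sqrt(sigma2_d)
```

With the default `f0_x = f0_y = 1` the adjusted statistic coincides with the usual normalized statistic; the null hypothesis would then be rejected at level $\alpha$ when the absolute value of the last returned number exceeds `NormalDist().inv_cdf(1 - alpha / 2)`.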

b)  The  Results 

The plot of $\tilde D(u)$ for the reflectance data is presented in Figure 1. A diagonal line representing the function $D(u)=u$ has been added to the plot to help in comparing $\tilde D(u)$ to its theoretical value under the null hypothesis. This plot suggests that the two populations are different, the first one being stochastically smaller. We infer this from the fact that $\tilde D(u)\ge D(u)$ for $0<u\le 1$. Table 1 presents the numerical results for the three linear rank statistics: the Wilcoxon, the normal score and the median. The symbols at the top of each column in the table are those defined in the previous subsection. The values for $T_N^*(J)$ show that under independence the null hypothesis can be rejected with p-value less than .001. However, when dependence is taken into account, the values for $T_{ND}(J)$ show that the null hypothesis can no longer be rejected so safely and it might be that the autocorrelations within each sample caused the two samples to look so different.

The results for the second data set, the International Airlines data, lead to similar conclusions. The plot of $\tilde D(u)$, presented in Figure 2, suggests that the two samples are different, the first one being stochastically larger. Table 2 presents the numerical results for the three linear rank statistics. The values of $T_N^*(J)$ indicate that under independence the null hypothesis can be rejected with p-value less than .001. But, again, the autocorrelations within each sample increase the p-value to around .05, so we have to be more careful in rejecting the null hypothesis.


Table 1. REFLECTANCE DATA. Linear Rank Statistics - Numerical Results.

Statistic    T_N(J)    T_N*(J)    f(0;J,F)    f(0;J,G)    sigma_D^2(J)    T_ND(J)
Wilcoxon      .319     -5.049      6.295       5.427        5.861         -2.086
Normal       -.595     -4.997      6.011       5.278        5.644         -2.103
Median       -.500     -3.969      6.032       4.784        5.408         -1.707

Table 2. INTERNATIONAL AIRLINES DATA. Linear Rank Statistics - Numerical Results.

Statistic    T_N(J)    T_N*(J)    f(0;J,F)    f(0;J,G)    sigma_D^2(J)    T_ND(J)
Wilcoxon      .635      4.415      2.950       4.819        3.885          2.240
Normal        .460      4.484      3.023       4.864        3.944          2.258
Median        .364      3.392      2.811       3.413        3.112          1.923
CHAPTER IV

RANK  TRANSFORM  SPECTRUM  :  EMPIRICAL  TECHNIQUES 

4.1  Introduction 

For a stationary time series $\{Y(j): j\in Z\}$ with a marginal DF $F$, we define the probability integral transform time series $\{U_Y(j): j\in Z\}$ by

$$U_Y(j) = F(Y(j)), \quad j\in Z. \tag{4.1.1}$$

For a finite realization $Y(1),\ldots,Y(N)$ from $\{Y\}$, we define the rank transform time series $\tilde U_Y(j)$, $j=1,\ldots,N$ by

$$\tilde U_Y(j) = \tilde F(Y(j)), \quad j=1,\ldots,N, \tag{4.1.2}$$

where $\tilde F$ is the EDF of the sample $Y(j)$, $j=1,\ldots,N$. As mentioned in the previous chapter, one can regard the sample $\tilde U_Y(j)$, $j=1,\ldots,N$ as an estimator of a sample from the time series $\{U_Y(j): j\in Z\}$. Another way to look at the rank transform time series is as a general monotone transformation of $\{Y\}$.

In time series analysis, the theory and techniques used do not require assumptions on the marginal distribution of the RVs making up the series (except for having finite second moments). However, the effect of the marginal distribution on the dependence structure is still of interest. One way to start gaining knowledge on this effect is to expose the rank transform time series to the time series analysis procedure used to analyze the original time series, and then compare the results. During this research work, the above procedure was repeated for several time series. Our general impression is that the shape of the spectral density is invariant under the rank transformation and, probably, under a general monotone transformation.

In  this  chapter,  we  are  going  to  present  these  results  for  two 
time  series:  the  Wolfer  Sunspots  data  and  the  Critical  Radio 
Frequencies  data.  In  section  2  below  we  describe  the  computational 
formulas  that  were  used  to  calculate  the  various  spectral  estimates. 
Section  3  contains  the  plots  of  these  spectral  estimates. 

4.2  Computational  Formulas 

Let $Y(1), \ldots, Y(N)$ be the observed sample, and let $Q(u)$ be the
sample quantile function calculated from that sample. Then, before
entering the spectral analysis procedure, we standardize the sample
by subtracting the median and dividing by twice the interquartile
range, i.e., we take

$$Y^*(j) = \frac{Y(j) - Q(.5)}{2\,[Q(.75) - Q(.25)]}, \quad j = 1, \ldots, N. \tag{4.2.1}$$

For convenience, we will keep denoting by $Y(j)$ the standardized
version defined above. In addition to $N$, the sample size, we define
two other vector sizes, denoted by NC and NF. NC is the number of
lags for which correlations are calculated and is taken as
$\min\{250, N/2\}$. NF is the number of equally spaced frequencies at
which spectral quantities are calculated. This number is chosen as
the smallest integer of the form $2^k 3^m 5^n$ which is greater than or
equal to N+NC. This choice enables us to calculate covariances by
inverting the periodogram through the Discrete Fourier Transform
(DFT), and it also makes the DFT routine work most efficiently.
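
The preprocessing described above can be sketched as follows. This is only an illustration under stated assumptions: the sample quantile function is taken as the left-continuous inverse of the EDF, $Q(u) = Y_{(\lceil Nu \rceil)}$, and $N/2$ is taken as integer division. The function names are hypothetical.

```python
import math


def sample_quantile(y, u):
    """Q(u) as the ceil(N*u)-th order statistic (assumed definition)."""
    ordered = sorted(y)
    index = max(math.ceil(len(y) * u), 1) - 1
    return ordered[index]


def standardize(y):
    """Apply (4.2.1): subtract the median, divide by twice the IQR."""
    median = sample_quantile(y, 0.5)
    iqr = sample_quantile(y, 0.75) - sample_quantile(y, 0.25)
    return [(v - median) / (2.0 * iqr) for v in y]


def num_lags(n):
    """NC = min(250, N/2), with N/2 taken as integer division (assumption)."""
    return min(250, n // 2)


def is_235_smooth(m):
    """True if m >= 1 has no prime factors other than 2, 3 and 5."""
    for p in (2, 3, 5):
        while m % p == 0:
            m //= p
    return m == 1


def num_freqs(n, nc):
    """NF = smallest integer of the form 2^k 3^m 5^n that is >= N + NC."""
    nf = max(n + nc, 1)
    while not is_235_smooth(nf):
        nf += 1
    return nf


y_std = standardize([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
nf_example = num_freqs(176, num_lags(176))
```

For a series of length 176, for example, NC is 88 and the search starts at 264, stopping at the first integer whose only prime factors are 2, 3 and 5.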


First we calculate the unnormalized sample periodogram, $f^*$, by

$$f^*(\omega_p) = (1/N) \left| \sum_{j=1}^{N} Y(j) \exp[2\pi i (j-1) \omega_p] \right|^2, \tag{4.2.2}$$

where $\omega_p = (p-1)/\mathrm{NF}$, $p = 1, \ldots, \mathrm{NF}$, $i = (-1)^{1/2}$, and $|z|^2$ is the squared
modulus of a complex number $z$. The covariance function, $R(\cdot)$, is
then calculated by taking the inverse DFT of the periodogram, i.e.,

$$R(k) = (1/\mathrm{NF}) \sum_{p=1}^{\mathrm{NF}} f^*(\omega_p) \exp[-2\pi i k \omega_p], \quad k = 0, 1, \ldots, \mathrm{NC}. \tag{4.2.3}$$

Then we normalize the periodogram by

$$f_N(\omega_p) = f^*(\omega_p)/R(0), \tag{4.2.4}$$

and calculate the correlation function, $\rho(\cdot)$, by

$$\rho(k) = R(k)/R(0), \quad k = 0, 1, \ldots, \mathrm{NC}. \tag{4.2.5}$$
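
Formulas (4.2.2) through (4.2.5) can be sketched with a direct (slow) DFT, which is enough to check the arithmetic on a short series. This is an illustrative sketch, not the original program, which used a fast DFT routine. The function names are hypothetical, and the index `p` here runs from 0 to NF-1, so that `p / nf` plays the role of $\omega_p$.

```python
import cmath
import math


def periodogram(y, nf):
    """Unnormalized periodogram f*(omega_p) at omega_p = p/NF, p = 0..NF-1."""
    n = len(y)
    values = []
    for p in range(nf):
        omega = p / nf
        s = sum(y[j] * cmath.exp(2j * math.pi * j * omega) for j in range(n))
        values.append(abs(s) ** 2 / n)
    return values


def covariances(f_star, nc):
    """R(k), k = 0..NC, as the inverse DFT of the unnormalized periodogram."""
    nf = len(f_star)
    result = []
    for k in range(nc + 1):
        s = sum(f_star[p] * cmath.exp(-2j * math.pi * k * p / nf)
                for p in range(nf))
        result.append(s.real / nf)
    return result


y = [1.0, -1.0, 2.0, 0.5]
nc, nf = 2, 6                                 # NF >= N + NC
f_star = periodogram(y, nf)
R = covariances(f_star, nc)
rho = [r / R[0] for r in R]                   # correlations (4.2.5)
f_n = [f / R[0] for f in f_star]              # normalized periodogram (4.2.4)
```

Because NF is at least N+NC, the inverse DFT returns the lagged products $(1/N)\sum_j Y(j)Y(j+k)$ without wrap-around for every lag up to NC, which can be verified directly on the short example above.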

As can be seen above, we define spectral functions on the unit
interval, so that they are symmetric about $\omega = 0.5$. Also, to examine
spectral functions we plot their logarithm, using fixed-scale axes
with the abscissa (representing the frequencies) running from 0 to
0.5 and the ordinate (representing the log spectra) running from -6.0
to +6.0. The next step of the analysis is to plot the sample
periodogram function, $f_N$. Next we calculate a spectral window
smoothed periodogram using the Parzen window. The computational
formulas for this estimator are as follows. The Parzen window
function, $w(k)$, is defined by


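$$w(k) = \begin{cases} 1 - 6\left[(k/\mathrm{NC})^2 - (k/\mathrm{NC})^3\right], & k = 0, 1, \ldots, \mathrm{NC}/2, \\ 2\left[1 - (k/\mathrm{NC})\right]^3, & k = \mathrm{NC}/2, \ldots, \mathrm{NC}. \end{cases} \tag{4.2.6}$$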

Then the smoothed periodogram, $f_w$, is calculated by

$$f_w(\omega_p) = 1 + 2 \sum_{k=1}^{\mathrm{NC}} w(k)\,\rho(k) \cos(2\pi k \omega_p). \tag{4.2.7}$$
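
The window (4.2.6) and the smoothed estimate (4.2.7) can be sketched as below. The code assumes the standard form of the Parzen window, with the truncation point equal to NC, and the function names are hypothetical.

```python
import math


def parzen_window(k, nc):
    """Parzen lag window w(k) with truncation point NC, as in (4.2.6)."""
    x = k / nc
    if x <= 0.5:
        return 1.0 - 6.0 * (x ** 2 - x ** 3)
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0


def smoothed_periodogram(rho, nf):
    """f_w at omega_p = p/NF, p = 0..NF-1, from rho(0), ..., rho(NC)."""
    nc = len(rho) - 1
    values = []
    for p in range(nf):
        omega = p / nf
        total = sum(parzen_window(k, nc) * rho[k] * math.cos(2 * math.pi * k * omega)
                    for k in range(1, nc + 1))
        values.append(1.0 + 2.0 * total)
    return values


flat = smoothed_periodogram([1.0, 0.0, 0.0, 0.0], 8)        # white noise
red = smoothed_periodogram([1.0, 0.5, 0.25, 0.125], 8)      # positive correlations
```

The two branches of the window agree at $k = \mathrm{NC}/2$, where both give 0.25, so the window is continuous. Uncorrelated data give a flat estimate equal to one, while positive correlations concentrate the estimate at low frequencies.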

This function is then plotted using the standard spectral plot
described above. The last spectral estimator presented here is the
autoregressive spectral density using the CAT order determination
function. The computational formulas for this procedure are given
below. First, the correlation function, $\rho(\cdot)$, is used to calculate
the partial autocorrelation function, $\mathrm{pac}(\cdot)$, by solving the
Yule-Walker equations successively for orders $1, \ldots, \mathrm{NC}$. For order $k$, the
Yule-Walker equations are given by

$$V_k = \sum_{j=0}^{k} a_k(j)\,\rho(j)$$

and

$$0 = \sum_{j=0}^{k} a_k(j)\,\rho(|j-m|), \quad m = 1, \ldots, k, \tag{4.2.8}$$

where $a_k(0) = 1$. These equations are solved for $a_k(1), \ldots, a_k(k)$ and
for $V_k$ using the given correlations $\rho(0), \rho(1), \ldots, \rho(k)$. The $a_k$'s are
the coefficients of the AR($k$) model with these first $k$ correlations
and $V_k$ is the residual variance for that model. After solving the
equations for orders $1, \ldots, \mathrm{NC}$, the partial autocorrelations are given
by

$$\mathrm{pac}(k) = a_k(k), \quad k = 1, \ldots, \mathrm{NC}. \tag{4.2.9}$$


The Yule-Walker equations are solved for successive orders using the
following recursive algorithm.

$$\begin{aligned}
a_0(0) &= 1, \\
V_0 &= \rho(0) = 1, \\
a_k(0) &= 1, \\
a_k(k) &= -\sum_{j=0}^{k-1} a_{k-1}(j)\,\rho(k-j) \Big/ V_{k-1}, \\
a_k(j) &= a_{k-1}(j) + a_k(k)\,a_{k-1}(k-j), \quad j = 1, \ldots, k-1, \\
V_k &= \sum_{j=0}^{k} a_k(j)\,\rho(j).
\end{aligned}$$
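
The recursion can be sketched as follows, assuming the sign convention $a_k(0) = 1$ with all terms of the Yule-Walker equations on one side, so that an AR(1) series with positive correlation has a negative $a_1(1)$. The function name is hypothetical, and the sketch does not guard against a zero residual variance.

```python
def yule_walker_recursion(rho, max_order):
    """Solve the Yule-Walker equations for orders 1..max_order.

    rho must hold rho(0), ..., rho(max_order) with rho(0) = 1.
    Returns (coeffs, variances) with coeffs[k] = [a_k(0), ..., a_k(k)]
    and variances[k] = V_k.
    """
    coeffs = [[1.0]]
    variances = [1.0]
    for k in range(1, max_order + 1):
        prev, v_prev = coeffs[-1], variances[-1]
        # a_k(k) = -sum_{j=0}^{k-1} a_{k-1}(j) rho(k-j) / V_{k-1}
        a_kk = -sum(prev[j] * rho[k - j] for j in range(k)) / v_prev
        # a_k(j) = a_{k-1}(j) + a_k(k) a_{k-1}(k-j), j = 1..k-1
        a_k = [1.0] + [prev[j] + a_kk * prev[k - j] for j in range(1, k)] + [a_kk]
        coeffs.append(a_k)
        variances.append(sum(a_k[j] * rho[j] for j in range(k + 1)))
    return coeffs, variances


# Correlations of an AR(1) series with rho(k) = 0.5**k (illustrative input).
rho = [0.5 ** k for k in range(4)]
coeffs, variances = yule_walker_recursion(rho, 3)
pac = [coeffs[k][k] for k in range(1, 4)]
```

For these correlations the first partial autocorrelation is -0.5, the higher ones vanish, and the residual variance stays at 0.75 from order 1 onward, as expected for a first order scheme.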

The partial autocorrelations are then used to calculate the residual
variances $\mathrm{RV}(\cdot)$ using the formula

$$\mathrm{RV}(j) = \mathrm{RV}(j-1)\left[1 - \mathrm{pac}(j)^2\right], \quad j = 1, \ldots, \mathrm{NC}, \tag{4.2.10}$$

where $\mathrm{RV}(0) = 1$. It should be noted that $\mathrm{RV}(j) = V_j$, and that the
recursion (4.2.10) is regarded as the more stable method of computing
the residual variances. The order determination function, CAT, is then
calculated from the residual variances using the formula


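$$\mathrm{CAT}(k) = (1/N) \sum_{j=1}^{k} \frac{1 - (j/N)}{\mathrm{RV}(j)} - \frac{1 - (k/N)}{\mathrm{RV}(k)}, \quad k = 1, \ldots, \mathrm{NC}. \tag{4.2.11}$$
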
Next, we determine the AR order as the value of $k$ which minimizes the
function $\mathrm{CAT}(\cdot)$, and denote it by $k_1$. The coefficients,
$a_{k_1}(1), \ldots, a_{k_1}(k_1)$, of the AR($k_1$) model are then calculated by
solving the Yule-Walker equations for order $k_1$. Finally, the AR($k_1$)
spectral density is calculated using the formula

$$f_{AR}(\omega_p) = \mathrm{RV}(k_1) \left| 1 + \sum_{j=1}^{k_1} a_{k_1}(j) \exp[2\pi i j \omega_p] \right|^{-2}. \tag{4.2.12}$$

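The last steps, from the partial autocorrelations to the AR spectral density, can be sketched as follows. The CAT function is written here in Parzen's usual form, $(1/N)\sum_{j \le k} [1-(j/N)]/\mathrm{RV}(j) - [1-(k/N)]/\mathrm{RV}(k)$, which is an assumption of this sketch. The function names and the sample size of 100 are hypothetical.

```python
import cmath
import math


def residual_variances(pac):
    """RV(0) = 1 and RV(j) = RV(j-1) * (1 - pac(j)^2), as in (4.2.10)."""
    rv = [1.0]
    for p in pac:
        rv.append(rv[-1] * (1.0 - p * p))
    return rv


def cat_function(rv, n):
    """CAT(k) for k = 1..NC, given rv = [RV(0), ..., RV(NC)] and sample size n."""
    values = []
    running = 0.0
    for k in range(1, len(rv)):
        term = (1.0 - k / n) / rv[k]
        running += term
        values.append(running / n - term)
    return values


def ar_spectrum(a, rv_k, nf):
    """AR spectral density at omega_p = p/NF for a = [1, a(1), ..., a(k1)]."""
    values = []
    for p in range(nf):
        omega = p / nf
        s = sum(a[j] * cmath.exp(2j * math.pi * j * omega) for j in range(len(a)))
        values.append(rv_k / abs(s) ** 2)
    return values


# Partial autocorrelations of an AR(1) series with rho(k) = 0.5**k.
rv = residual_variances([-0.5, 0.0, 0.0])
cat = cat_function(rv, 100)
best_order = cat.index(min(cat)) + 1          # k1, the minimizer of CAT
f_ar = ar_spectrum([1.0, -0.5], rv[best_order], 4)
```

In this example CAT is smallest at order 1, and the resulting spectral density is largest at frequency zero, as expected for a first order scheme with positive correlation.
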
4.3  The  Graphical  Results 

The results for the first data set, the Wolfer Sunspots data,
are presented in figures 3 through 5. Figure 3 shows the two
periodograms, which seem to represent the same spectral behavior. The
smoothed periodograms in figure 4 show that most of the variability
is concentrated in the frequency band 0 to 0.2, in which the two
spectral estimates are very similar. When the time series are modeled
with an autoregressive scheme, the resulting spectral density
estimates are, again, very close but not identical. The peak of the
rank transform spectrum is slightly higher and occurs at a lower
frequency. The general shape of the two spectral density estimates
is, however, the same.

Figures  6  through  8  give  the  same  results  for  the  second  data 
set,  the  Critical  Radio  Frequencies  data.  Here  the  results  are  even 
more  convincing.  The  peaks  are  of  the  same  height  and  occur  at  the 
same  frequencies  in  all  three  spectral  estimates,  while  the  shape  in 
general  is  almost  identical. 

We might mention here, as a general remark, that the results for
the other time series we have checked, which are not presented here,
were very similar to those shown here and sometimes even more striking.

Figure 3. WOLFER SUNSPOT DATA - Raw periodogram
Solid line - original data, dashed line - rank transform data

Figure 7. CRITICAL RADIO FREQUENCIES DATA - Smoothed periodogram
Solid line - original data, dashed line - rank transform data

Figure 8. CRITICAL RADIO FREQUENCIES DATA - AR Spectral density
Solid line - original data, dashed line - rank transform data

CHAPTER V

CONCLUSION 

5.1  Concluding  Remarks 

In this dissertation, the problem of applying nonparametric
techniques to time series data is considered. The approach taken
here is to extend existing theoretical results by interpreting and
reformulating expressions for variance which are otherwise not easy
to apply to statistical procedures. In particular, the asymptotic
variance of linear rank statistics in the two sample problem, under
dependence within the samples, is expressed in terms of the spectral
densities of the corresponding rank transform time series. This
result is then used to suggest estimators of the asymptotic variance
by using existing time series analysis procedures to
estimate the spectral densities of the rank transform time series.
Also, this expression of the asymptotic variance enables one to
better understand the conditions under which the asymptotic variance
under dependence is greater or smaller than under independence.

The  appearance  of  the  rank  transform  spectrum  in  the  expression 
for  the  asymptotic  variance  of  linear  rank  statistics  led  us  to 
examine  the  properties,  in  general,  of  the  rank  transform  spectrum. 
Our  impression,  after  examining  several  time  series,  is  that  the 
spectral  density  is  approximately  invariant  under  rank 
transformations.  A  report  of  this  empirical  investigation  is  given 
in  chapter  4. 


5.2  Problems  for  Further  Study 


The weak convergence of empirical processes formed from
dependent data is an active area of research. Most of the published
results in this area are for $\phi$-mixing and strong mixing sequences.
This presents a problem for applying those results in a time series
situation since a time series obeying an autoregressive scheme does
not, in general, satisfy the required mixing conditions. The notion
of strong mixing $\Delta_s$ sequences, introduced by Gastwirth and Rubin
(1975), solves the problem for some first order autoregressive
schemes. It seems to be an open problem to prove a result for weak
convergence of empirical processes that holds for a general ARMA(p,q)
process.

Another  problem  for  further  research  is  to  find  other  test 
statistics  to  which  one  can  apply  the  approach  used  here  to  express 
the  asymptotic  variance  in  terms  of  the  spectral  density  of  some 
related  time  series. 

Our impression from working with small samples and with long
memory time series is that one can gain valuable information by
applying asymptotic techniques even when the data do not seem to
obey the required assumptions. It is hence desirable that theorems
be complemented by data analytic diagnostics usable by applied
statisticians.

In this work we have done a limited empirical investigation of
the properties of the rank transform spectrum. The theoretical
properties of the rank transform spectrum, however, are still an open
research problem.